Field of the invention



This invention relates to an isolated gene encoding a polypeptide of human hepatitis C virus 
(hereinafter, referred to as HCV), or a fragment thereof, and a polypeptide encoded thereby. 


Background of the invention 



Hepatitis viruses A, B and D had been identified, and the serodiagnosis for each virus had been
established, before the present invention. However, there was at least one form of hepatitis whose cause
remained unknown (Digestive Diseases and Sciences, 31: 122S-132S (1986); and Seminars in Liver Diseases,
6: 56-66 (1986)).

Serodiagnosis for hepatitis A virus (HAV) and hepatitis B virus (HBV) has been established and clinically
employed since the mid-1970s, which revealed that most cases of blood-transfusion-associated hepatitis are
caused by unknown pathogen(s) other than viruses capable of growing in hepatocytes, such as HAV or
HBV. The hepatitis caused by the unknown pathogen was designated "non-A, non-B hepatitis (NANBH)". In
the United States, the incidence of hepatitis following transfusion is about 1 to 10% of all
patients who have undergone transfusion, and more than 90% of such post-transfusional hepatitis is reported
to be NANBH (Jikken Igaku, 8, 3: 15-18 (1990)). In Japan, about 200,000 patients, corresponding to about 10 to
20% of those who have undergone transfusion, suffer from post-transfusional hepatitis every year, and
about 95% of them are diagnosed with NANBH. Furthermore, about 300,000 people are diagnosed with
sporadic hepatitis every year, and about 40 to 50% of these cases are considered to be NANBH. Epidemic
NANBH also occurs in Japan. Although the route of infection for NANBH has not been established, in contrast
with hepatitis A or B, it is likely different from those for hepatitis A and B (Jikken Igaku, 8, 3: 13-14 (1990)).

Chiron Corp. (May, 1988) succeeded in isolating a gene fragment of a virus responsible for NANBH
by means of a unique technique quite different from conventional ones, and designated said virus
hepatitis C virus (HCV). Many researchers followed this work and sequenced the entire gene encoding both
the non-structural and structural proteins of HCV (Shimotohno et al., Proc. Natl. Acad. Sci. USA, 87: 9524-
9526 (1990); and Takamizawa et al., Journal of Virology, 65, 3: 1105-1113 (1991)).

Many patent applications directed to the HCV gene have been filed so far, for example, European Patent
Publication Nos. 318216, 388232, 398748, 419182, 450931, 464287, 463848, 468657, WO 91/01376, WO
91/15516, British Patent No. 2239245, and the like.

Chiron Corp. and Ortho, Inc. have developed an enzyme-linked immunosorbent assay (ELISA) for HCV
and a kit therefor, using a recombinant antigen (clone C100-3). The antigen was obtained by transforming
yeast cells with an expression plasmid encoding a fused peptide comprising a human superoxide dismutase and
a 363 amino acid polypeptide encoded by a gene for the region from NS3 to NS4, which encodes a part of
the non-structural protein, and growing the transformants under conditions that allow them to express said
fused peptide (WO No. 89/04669; and European Patent Publication No. 318216).

The Japanese Welfare Ministry (KOSEI-SHO), ahead of other nations in the world, decided to introduce
said Chiron kit for the screening and detection of anti-HCV antibody, and its import started on
December 26, 1989. From the next day, the Japanese Red Cross began screening for anti-HCV antibody in
blood donated by volunteers using the kit. About 1.7 million people are estimated to undergo blood
transfusion yearly. Before the screening, the incidence of post-transfusional hepatitis among them was
about 12.3% (about 173,000), and thereafter it fell to about 3%.

The variability of HC-associated antigen is often suggested as an outstanding and critical feature. For
instance, the homology in amino acid and base sequences between the C100-3 clone and a clone obtained in
Japan was reported to be about 80%. Although the difference is only 20%, it can affect the accuracy of
the detection of HCV. In another aspect, the homology between HC-associated antigens varies from region
to region; for example, it is only 70% for all or a part of the NS1, NS2, NS3 and NS5 regions
(according to the designation by Chiron Corp.), which indicates that some substances may be overlooked
by Chiron's kit. As is often the case with viruses, especially those having an RNA genome, genetic mutation
occurs at a high frequency, which leads to changes in antigenic determinant sites. As a result, the
HC-associated antigen presented by antigen-presenting cells in serum, and the antibody raised against it,
also change in the course of the disease.

Another kit, provided by Ortho, Inc., is accurate in detecting anti-HCV antibodies raised during a
restricted period of the disease, that is, antibodies raised while the disease progresses from an
acute stage to a chronic stage, which begins about 24.7 weeks after the infection (SAISHIN IGAKU, 45, 12:
2331-2336 (1990); IGAKUNOAYUMI, 151: 892-896). Thus, Ortho's kit is not effective for detecting
antibodies raised against all the HC-associated antigens throughout the disease, especially those presented
during the acute and chronic stages.

Accordingly, an assay method useful for the detection of any anti-HCV antibodies raised against the various
HC-associated antigens existing in the serum of a patient throughout the disease has been needed. HCV is
detectable in hepatocytes of patients in various phases of disease, including acute and chronic hepatitis,
hepatocirrhosis, and hepatoma. Recently, interest has concentrated on the pathogenetic relationship
between HCV infection and hepatoma because about 50 to 60% of hepatoma patients are HCV positive.
Although the pathogenetic relationship between HCV infection and hepatoma has not been established, it is
generally accepted that there are some relationships between chronic hepatitis, hepatocirrhosis and
hepatoma. Therefore, screening for anti-HCV antibody in the serum of a subject susceptible to them may
be helpful for preventing such serious diseases. Thus, a more accurate and efficient screening method, as
well as serodiagnosis, is strongly desired to prevent HCV-related diseases. For this purpose, a reagent and
a kit having an extended utility are necessary, for example, in the assay of serum of a variety of subjects,
including carriers of HCV without manifest symptoms and patients suffering from HC at various stages, such
as acute, chronic, or advanced hepatitis. As the number of HCV-infected patients increases, the percentage
of HCV contamination in blood donated by volunteers increases. This causes a serious problem all over the
world; for instance, the percentage of HCV-positive blood in total blood donated by volunteers is about 10,
0.8, 1.5, and 1.2% in Japan, USA, Italy, and Spain, respectively (TANPAKUSITU, KAKUSAN, KOSO, 36, 10:
1679-1691 (1991)). However, there are no effective methods for treating HCV infection, and therefore a
method for the detection of HCV in the serum of suspected subjects is strongly required to prevent
HCV-related diseases.


Summary of the Invention 

As previously mentioned, the HCV gene is extremely liable to vary, and subtypes of HCV should differ to a
great extent at various sites, including surface antigenic sites and others responsible for determining
significant features of HCV protein. As these mutated viruses induce hepatitis C with different symptoms
depending on the type when they infect humans, variants low in homology are considered to be different
from each other.

In this regard, the present inventors isolated plural viruses which differ from each other in terms of
amino acid and DNA sequences from sera of patients with HC (HC patients).

The present invention was established by isolating a novel hepatitis C virus, separating the RNA encoding
viral protein, converting the RNA into cDNA using reverse transcriptase, and cloning and sequencing the
resultant DNA. When the isolated DNA was ligated to an appropriate expression vector and introduced into
host cells, the transformants expressed HC-associated antigen.

DNA obtained by reverse-transcribing the RNA of HCV encodes a recombinant antigen which is immunochemically
the same as HCV-associated antigen. Therefore, for the purpose of the invention, the terms "cDNA",
"DNA" and "gene" are used interchangeably, as far as they encode the same protein(s) or antigens as
those encoded by the RNA gene of HCV. As one of skill will easily appreciate, a DNA fragment encoding an
epitopic site of HCV-associated antigen is also useful for producing a polypeptide capable of specifically
reacting with anti-HCV antibody in the same manner as intact HC-associated antigen. Therefore, such a DNA
fragment is also useful for the purpose of the invention.

Thus, the present invention provides an isolated gene of a novel hepatitis C virus and a fragment
thereof. The HCV gene and its fragment of the invention are useful for the development of a diagnostic
method which is more accurate and effective than conventional ones in the detection of antibodies raised
against a wide range of HCVs which could hardly be detected before the present invention. The gene and
fragments thereof are also useful for the preparation of a novel vaccine.

In another aspect of the invention, an in vitro screening system for a substance capable of specifically
suppressing or controlling the proteolytic processing of a precursor polypeptide of HCV can be obtained. The
screening system can be established by analyzing the viral protease closely. The analysis can be carried out
by synthesizing a + strand of RNA from a double-stranded DNA containing the HCV-originated protease gene
and its adjoining regions, producing a polypeptide comprising the viral protease in vitro, and characterizing
said protease as to its activity, specificity, function, and the like.

In another aspect, the present invention provides an in vivo screening system for a substance
capable of suppressing the processing of viral precursor protein. The screening can be carried out using a
transformant, for example, eucaryotic cells such as animal cells, which have been transformed with a DNA
fragment of the invention and can express a precursor polypeptide of HCV and process the product
intracellularly.

Specifically, the present invention provides an isolated DNA (gene) encoding all or a part of a polypeptide
having an amino acid sequence of any of SEQ ID NO 1 to 43, 64 to 75 and 101 to 104, or a fragment thereof.

The present invention further provides a polypeptide having all or a part of the amino acid sequence of any
of SEQ ID NO 1 to 43, 64 to 75 and 101 to 104.

The polypeptides of the invention have the ability to react immunochemically and specifically with
antiserum obtained from patients suffering from hepatitis C.

As the amino acid and DNA sequences of the polypeptides of HCV are determined, it is easy to obtain
active derivatives of viral protein which fall within the scope of the present invention by conventional
methods that lead to the insertion, deletion, replacement or addition of amino acids without changing the
specific reactivity with sera from patients suffering from hepatitis C. This can be conducted by, for example,
site-specific mutagenesis of DNA.

Therefore, the present invention also provides active derivatives of HCV protein obtained by conventional
methods, and DNA fragments encoding them, which can react immunochemically with antiserum raised
against HC-associated antigen.

In this regard, the present invention provides a polypeptide fragment having a modified amino acid
sequence derived from polypeptides having an amino acid sequence of SEQ ID NO 1 - 104, and being
capable of reacting with serum of HC patients with a different specificity, for example, those claimed in
Claims 119, 121, 123, 125, 127, 129, 131 - 136, 138, 140 - 154, 157 - 179, 185 - 199, wherein the
modification has been made by deletion, insertion, modification or addition of amino acid(s), provided that
the ability to react with antiserum from HC patients is not decreased.

Furthermore, the present invention provides an expression vector which comprises DNA shown by
any of SEQ ID NO 1 to 43, 64 to 75 and 101 to 104, or a fragment thereof, and has the ability to allow a
host cell to express said DNA when transformed into the same.

The present invention also provides a transformant transformed with the expression vector. 

The present invention further provides a method for preparing HCV protein or HCV-associated antigen 
by culturing a transformant in a medium and recovering the product from the cultured broth. 


Definition 



For the purpose of the invention, the following terms are defined below. 

HCAg: HCV- or HC-associated antigen. For the purpose of the invention, as hepatitis C is caused by
HCV, the terms "HC-associated antigen" and "HCV-associated antigen" are used interchangeably.

HCAb: antibody raised against HCV-associated antigen. 

HCV protein or HCV polypeptide: protein or polypeptide encoded by HCV gene. 

HCV gene: generally, it is the RNA gene of HCV. However, for the purpose of the invention, it refers to
a gene encoding the HCV polypeptide or protein encoded by the RNA gene. Therefore, the terms "gene", "cDNA",
and "DNA" obtained from the RNA gene are used interchangeably.

Recombinant HCAg: a product (protein or polypeptide, including glycosylated ones) which is produced in host
cells transformed by DNA of the invention and is capable of immunochemically reacting with HCAb.

Recombinant polypeptide: polypeptide expressed by host cells transformed by HCV gene of the 
invention. 

HC patient: a patient suffering from hepatitis C.

Detailed Description of the Invention

[1] Gene Encoding Core-envelope Region

(1) Preparation of cDNA clone of SEQ ID NO 1 - 12 and sequencing thereof 

The cDNA clones of SEQ ID NO 1 - 12 which encode a novel polypeptide of core-envelope region of 
HCV protein were cloned from serum from HC patients as follows. 

The cloning and sequencing of cDNA encoding HCV polypeptide can be carried out using any known
method. However, it is hardly accomplished by the known "Okayama-Berg" or "Gubler-Hoffman" method
because the content of HCV in serum is very low and the HCV gene is liable to variation. The
present inventors succeeded in cloning the gene from a small amount of serum, as will be hereinafter
described in Example 1. Briefly, it was conducted by extracting nucleic acids from the serum of a patient
suffering from HC. It is preferable to use serum showing an OD value of 3.5 on a screening kit of Ortho.
Before the extraction, it is desirable to add tRNA or polyribonucleoside to the serum as a carrier for viral
RNA. For the purpose of the invention, tRNA is preferable because the degradation of RNA can be easily
detected, at least after the addition of tRNA, by monitoring on electrophoresis the existence of a sufficient
amount of tRNA having an intact length.

The resultant RNA is converted into cDNA using reverse transcriptase in the presence of an appropriate
oligonucleotide primer. The cDNA is then cloned and amplified by a modified polymerase chain reaction
(PCR) (Saiki et al., Nature 324: 126 (1986)) in the presence of a pair of primers. Although commercially
available random primers can be used in the PCR, synthetic primers having the following base sequences
are suitable for the present invention.



Synthetic Primers 



      5'                 3'
S1:   CTCCACCATAGATCACTCC (SEQ ID NO: 105)
S2:   AGGTCTAGTAGACCGTGC (SEQ ID NO: 106)
S3:   AGGAAGACTTCCGAGCGG (SEQ ID NO: 107)
S4:   CGTGAACTATGCAACAGGG (SEQ ID NO: 108)
AS1:  ACCGCTCGGAAGTCTTCC (SEQ ID NO: 109)
AS2:  GGGCAAGTTCCCTGTTGC (SEQ ID NO: 110)
AS3:  GCTGGATTCTCTGAGACG (SEQ ID NO: 111)



PCR can be conducted under appropriate conditions, for example, those described in Example 2, using
the first complementary DNA (1st cDNA) as a template. The conditions may vary depending on the primers
used (their base sequence or combination), the length to be amplified, and the like. Examples of primer pairs
are: S1 - AS1; S1 - AS2; S1 - AS3; S2 - AS1; S2 - AS2; S2 - AS3; S3 - AS2; S3 - AS3; and S4 - AS3.
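
The primer list and the example pairs above can be tabulated with a short script. The following is only a minimal sketch: the sequences and pairs are copied from the text, while the helper names (`reverse_complement`, `gc_fraction`, `summarize_pairs`) and the choice to report length and GC fraction are illustrative assumptions, not part of the described method.

```python
# Primer sequences copied from the list above, written 5' -> 3'.
PRIMERS = {
    "S1": "CTCCACCATAGATCACTCC",
    "S2": "AGGTCTAGTAGACCGTGC",
    "S3": "AGGAAGACTTCCGAGCGG",
    "S4": "CGTGAACTATGCAACAGGG",
    "AS1": "ACCGCTCGGAAGTCTTCC",
    "AS2": "GGGCAAGTTCCCTGTTGC",
    "AS3": "GCTGGATTCTCTGAGACG",
}

# Primer pairs named in the text as examples (sense, antisense).
PRIMER_PAIRS = [
    ("S1", "AS1"), ("S1", "AS2"), ("S1", "AS3"),
    ("S2", "AS1"), ("S2", "AS2"), ("S2", "AS3"),
    ("S3", "AS2"), ("S3", "AS3"),
    ("S4", "AS3"),
]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize_pairs():
    """Yield (sense, antisense, sense_len, antisense_len, sense_gc, antisense_gc)."""
    for sense, antisense in PRIMER_PAIRS:
        s, a = PRIMERS[sense], PRIMERS[antisense]
        yield sense, antisense, len(s), len(a), gc_fraction(s), gc_fraction(a)


if __name__ == "__main__":
    for sense, antisense, ls, la, gs, ga in summarize_pairs():
        print(f"{sense}-{antisense}: {ls} nt (GC {gs:.2f}) / {la} nt (GC {ga:.2f})")
```

With these sequences, `reverse_complement(PRIMERS["S3"])` shares 17 consecutive bases with AS1, which may be why S3 - AS1 does not appear among the listed pairs; that reading is an inference from the sequences, not a statement made in the text.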

The minimum amount of serum required for the cloning described in Example 2 [2] varies depending
on the content of virus in the serum used; however, it may be about 5 to 7 µl when the serum shows an OD of 3
or more on the aforementioned Ortho kit. The base sequence of cDNA obtained using a random primer in the
synthesis of the 1st strand cDNA was the same as that of cDNA obtained using an antisense primer which was
designed and synthesized (Example 8).

Thus, a region (clone N1-1) was obtained by two different methods. Three clones independently
obtained from the serum of a patient using random primers are shown as a clone of SEQ ID NO 1. When
synthetic DNA (S1 and AS1) was used as primers, two of the three clones obtained independently had
the same base sequence as that of SEQ ID NO 1, and one clone had a modified base sequence wherein
three amino acids of SEQ ID NO 1 were changed, i.e., No. 345 A to C, No. 332 A to T, and No. 95 A to C,
which shows that there is more than one virus in one patient.

The resultant DNA fragment is then subjected to the determination of its base sequence. Generally, three
clones obtained independently are employed, and the base sequences of both strands are determined to
obtain an entire base sequence. The base sequence is conveniently determined using a fluorescence
sequencer GENESIS 2000 (DUPONT) according to the protocol attached thereto. Alternatively, conventional
subcloning can be used when the DNA fragment consists of more than 180 nucleotides or contains a
region which is hardly determined by the fluorescence sequencer.

The base sequences thus obtained are shown in SEQ ID NO 1 to 12.

For the purpose of the invention, a part of base sequences may be changed, for example, No. 345 A to 
C, No. 332 A to T, and No. 95 A to C, respectively. 
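
The three base changes named above can be expressed as a small position-checked substitution. The sketch below is illustrative only: the function name `apply_substitutions` is hypothetical, positions are assumed to be 1-based as written in the text, and the demonstration uses a placeholder string of `A` bases rather than the actual sequence of SEQ ID NO 1, which is not reproduced here.

```python
def apply_substitutions(sequence, changes):
    """Apply point substitutions given as (position, expected_base, new_base).

    Positions are 1-based. A ValueError is raised if a position is out of
    range or if the base found there is not the expected one, so a
    mis-numbered change cannot be applied silently.
    """
    bases = list(sequence)
    for position, expected, replacement in changes:
        index = position - 1
        if not 0 <= index < len(bases):
            raise ValueError(f"position {position} is outside the sequence")
        if bases[index] != expected:
            raise ValueError(
                f"position {position}: expected {expected}, found {bases[index]}"
            )
        bases[index] = replacement
    return "".join(bases)


# The changes named in the text: No. 345 A to C, No. 332 A to T, No. 95 A to C.
EXAMPLE_CHANGES = [(345, "A", "C"), (332, "A", "T"), (95, "A", "C")]

if __name__ == "__main__":
    placeholder = "A" * 350  # hypothetical stand-in, NOT the real SEQ ID NO 1
    variant = apply_substitutions(placeholder, EXAMPLE_CHANGES)
    print(variant[94], variant[331], variant[344])
```

Checking the expected base before replacing it is a design choice that guards against off-by-one numbering when the changes are transferred to a real sequence.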

(2) Expression of Polypeptides Encoded by Clones 


DNA fragments of SEQ ID NO 1 to 12 can be used to produce a recombinant HCAg by constructing an
expression vector containing DNA encoding a clone, that is, by inserting the DNA into a known expression
vector at an appropriate site of the vector, downstream from a promoter, using a method well known per se;
introducing the expression vector harboring the DNA into a host cell such as an Escherichia coli cell, yeast
cell, animal cell or the like according to a method known to one of skill; culturing the transformant in a
medium under appropriate conditions; and recovering the product from the cultured broth.

The present invention can be accomplished using any expression vector which has a promoter at an
appropriate site to direct the expression of a DNA encoding HCV polypeptide or a fragment thereof.
Expression vectors preferably contain a promoter, a ribosome binding (SD) sequence, a gene encoding HCV
polypeptide, a transcription termination factor, and a regulator gene.

Expression vectors functional in microorganisms such as Escherichia coli, Bacillus subtilis or the like
will preferably comprise a promoter, a ribosome binding (SD) sequence, an HCV-associated-protein-encoding
gene, a transcription termination factor, and a regulator gene.

Examples of promoters include those derived from Escherichia coli or phages, such as the tryptophan
synthetase (trp), lactose operon (lac), λ phage PL and PR, and T5 early gene P25 and P26 promoters, and
the like. These promoters may have a modified or designed sequence for each expression vector, such as
the pac promoter.

Although the SD sequence may be derived from Escherichia coli or phage, a sequence which has been
designed to contain a consensus sequence consisting of more than 4 bases, which is complementary to the
sequence at the 3' terminal region of 16S ribosomal RNA, may also be used.

The transcription termination factor is not essential. However, it is preferable that an expression vector
contains a ρ-independent factor such as the lipoprotein terminator, trp operon terminator or the like.

Preferably, these sequences required for the expression of a gene encoding HCAg originating from
HCV are located, in an appropriate expression plasmid, in the order of promoter, SD sequence, said gene
and transcription termination factor, in the 5' to 3' direction.

A typical example of an expression vector is the commercially available pKK233-2 (Pharmacia). However, the
series of pGEX plasmids (Pharmacia), which are provided for the expression of a fused protein, are also
employable for the expression of the HCAg-encoding gene of the present invention.

A suitable host cell such as Escherichia coli can be transformed with an expression vector comprising a
DNA of the invention by any known method, such as the protocol provided by TOYOBO Japan (Example 3).

The cultivation of the transformants can be carried out using any of the procedures well known in the
literature, such as Molecular Cloning, 1982, and the like. The cultivation is preferably conducted at a
temperature from about 28°C to 42°C.

Expression vectors used for transforming other host cells, such as those derived from insects or 
animals including mammals, consist of substantially the same elements as those described in the above. 
However, there are certain preferable factors as follows. 

When insect cells are used, a commercially available kit, MAXBAC™, is employed according to the
teaching of the supplier (MAXBAC™ BACULOVIRUS EXPRESSION SYSTEM MANUAL VERSION 1.4). In
this case, it is desirable to make a modification to reduce the distance between the promoter of the polyhedrin
gene and the initiation codon so as to improve the expression of the gene.

When animal cells are used as hosts, expression vectors preferably contain the SV40 early promoter, SV40
late promoter, apolipoprotein E gene promoter, or the like. Specifically, known expression vectors such as
pKCR (Proc. Natl. Acad. Sci. USA, 78: 1528 (1981)), pBPV MT1 (Proc. Natl. Acad. Sci. USA, 80: 398 (1983)),
or the like may be employed after minimal modification, for example, the insertion of a cloning site as
described in the literature (Nature, 307: 604 (1984)), so that the resultant vectors maintain the essential
functions to serve as expression vectors.

Animal cells usable in the present invention are CHO cells, COS cells, mouse L cells, mouse C127 cells,
mouse FM3A cells and the like.

The recombinant polypeptide expressed by host cells such as microorganisms including E. coli, insect
cells and animal cells can be recovered from the cultured broth by known methods and identified by, for
example, immunoreactions between the expressed product and antiserum obtained from HC patients using
a conventional method such as Western blot analysis.

As a result, polypeptides having the amino acid sequences of SEQ ID NO 1 to 12 were obtained as
expression products of cDNA obtained from serum of HC patients and identified as HCAg. Among them,
polypeptides having the 191 amino acid sequence from No. 1 to No. 191 of SEQ ID NO 5, 6 and 8 are
assumed to be polypeptides which were expressed and cleaved by processing in insect cells. Thus, the
sequences of SEQ ID NO 5, 6 and 8 comprise: from No. 174 to No. 188 (region A), an amino acid sequence
containing mainly hydrophobic amino acids having a large side chain of higher molecular weight; and at
Nos. 189 and 191, alanine, a residue having a small side chain of lower molecular weight. This pattern of
sequence keeps a feature of the signal region which is recognized by signal peptidase in animal (including
insect) cells. The 5'- and 3'- regions of said sequence contain many variations in amino acid sequence
resulting from variations in base sequence due to the replacement of a part of said sequence, when
compared with known HCV genes cloned before the present invention. However, the regions A and B
contain fewer variations, which indicates that the polypeptide may be cleaved at the C-terminus of the
No. 191 alanine by signal peptidase.

The polypeptide having the amino acid sequence from No. 1 to No. 191 of SEQ ID NO 5, 6, and 8 is assumed
to be core or matrix protein on the basis of the homology between said amino acid sequence and a known
sequence of the core or matrix region of viral protein of Japanese encephalitis virus or yellow fever virus.
The polypeptide comprising said 191 amino acid sequence is herein referred to as "core protein" or "core
region".

Polypeptides having the 18 amino acid sequence from No. 1 to No. 18 of SEQ ID NO 1, 9, 10, 11 and 12, the
34 amino acid sequence from No. 40 to No. 73 of SEQ ID NO 3, and the 35 amino acid sequence from No. 81
to No. 115 of SEQ ID NO 3 are relatively highly hydrophilic and highly homologous to polypeptides having
amino acid sequences deduced from known HCV genes cloned by Chiron, Shimotohno or Takamizawa
(ibid.), and are useful as HCV-associated antigenic peptides in diagnosis and/or for the preparation of vaccine.

Polypeptides having the 18 amino acid sequence from No. 40 to No. 57 of SEQ ID NO 4, and the 12 amino acid
sequence from No. 240 to No. 251 of SEQ ID NO 4, are relatively highly hydrophilic and extremely low in
homology with polypeptides having amino acid sequences deduced from known HCV genes cloned before
the present invention, and are useful as HCV-associated antigenic peptides in diagnosis. These polypeptides
can be produced by chemical synthesis, as well as by recombinant DNA techniques.

Furthermore, a polypeptide having the 115 amino acid sequence from No. 1 to No. 115 of SEQ ID NO 3,
corresponding to an epitopic region of core protein, and a polypeptide having the 191 amino acid sequence
from No. 1 to No. 191 of SEQ ID NO 3, corresponding to the total region of core protein, can be produced on
a large scale by recombinant DNA techniques and are useful as diagnostic reagents and/or vaccines.

Among them, a polypeptide having the 192 amino acid sequence from No. 31 to No. 222 of SEQ ID NO 4
is assumed to be a polypeptide which was expressed and cleaved by processing in insect cells. Thus, the
sequence of SEQ ID NO 4 comprises: from No. 13 to No. 29 (region A), an amino acid sequence containing
mainly hydrophobic amino acids having a side chain of higher molecular weight; at No. 30, alanine, a
residue having a side chain of lower molecular weight; from No. 210 to No. 221 (region B), an amino acid
sequence containing mainly hydrophobic amino acids having a side chain of higher molecular weight; and at
No. 222, glycine, a residue having a side chain of lower molecular weight. This pattern of sequence keeps a
feature of the signal region which is recognized by signal peptidase in animal (including insect) cells.
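
The pattern described above, a mainly hydrophobic stretch followed by a small residue such as alanine or glycine, can be screened for numerically. The sketch below is an assumption-laden illustration: it uses the Kyte-Doolittle hydropathy scale, which the text does not mention, as one common way to quantify "hydrophobic"; the function names, the zero threshold, and the toy sequence are all hypothetical, and the real SEQ ID NO 4 is not reproduced here.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale; the text
# only speaks qualitatively of hydrophobic and hydrophilic residues).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Small residues named in the text as following the hydrophobic regions.
SMALL_RESIDUES = {"A", "G"}


def mean_hydropathy(peptide):
    """Average hydropathy of a peptide; positive means mainly hydrophobic."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def hydropathy_profile(sequence, window=9):
    """Sliding-window mean hydropathy, one value per window start."""
    return [
        mean_hydropathy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


def hydrophobic_then_small(sequence, region_start, region_end, small_position):
    """Check a hydrophobic region (1-based, inclusive) followed by a small residue."""
    region = sequence[region_start - 1:region_end]
    return (
        mean_hydropathy(region) > 0
        and sequence[small_position - 1] in SMALL_RESIDUES
    )


if __name__ == "__main__":
    toy = "MKT" + "LLVIAFLLVV" + "A" + "DEK"  # hypothetical toy sequence
    print(hydrophobic_then_small(toy, 4, 13, 14))
```

For an actual sequence, the positions given in the text (for example region A at No. 13 to No. 29 with alanine at No. 30) would be passed as `region_start`, `region_end` and `small_position`.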

The 5'- and 3'- regions of said sequence contain many variations in amino acid sequence resulting from
those in base sequence due to the replacement of base sequences, when compared with known HCV genes
cloned before the present invention. However, the regions A and B contain fewer variations, indicating that
the polypeptide may be cleaved by signal peptidase at the C-terminus of the No. 30 alanine.

The polypeptide is assumed to be envelope protein of HCV or a fragment thereof on the basis of the 
low homology between the base sequence encoding said polypeptide and a known base sequence which 
encodes a corresponding region of HCV protein. The polypeptide comprising said 192 amino acid 
sequence is herein referred to as "M-gp35 protein" or "M-gp35 region". 

Polypeptides having the 19 amino acid sequence from No. 134 to No. 152 of SEQ ID NO 4, the 17 amino acid
sequence from No. 223 to No. 239 of SEQ ID NO 4, and the 18 amino acid sequence from No. 92 to No. 109 of
SEQ ID NO 4 are relatively highly hydrophilic and highly homologous to polypeptides having amino acid
sequences deduced from known HCV genes, and are useful as HCV-associated antigenic peptides in
diagnosis and/or for the preparation of vaccine.

Polypeptides having the 18 amino acid sequence from No. 40 to No. 57 of SEQ ID NO 4, and the 12 amino acid
sequence from No. 240 to No. 251 of SEQ ID NO 4, are relatively highly hydrophilic and extremely low in
homology with polypeptides having amino acid sequences deduced from known HCV genes cloned before
the present invention, and are useful as HCV-associated antigenic peptides for diagnosis.

These polypeptides can be produced by chemical synthesis, as well as by DNA recombinant technique. 

Furthermore, polypeptides having the 76 amino acid sequence from No. 31 to No. 106 of SEQ ID NO 4,
and the 36 amino acid sequence from No. 134 to No. 169 of SEQ ID NO 4, which correspond to an epitopic
region of M-gp35 protein, can be produced on a large scale by recombinant DNA techniques and are useful as
diagnostic reagents and/or vaccines.
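
Each fragment in this section is given both as a residue count and as a numbered range, so the two can be cross-checked with simple inclusive arithmetic (length = end - start + 1). The sketch below is illustrative: the tuples are transcribed from ranges stated in the text, while the names `REGIONS`, `inclusive_length` and `find_mismatches` are hypothetical.

```python
# (label, first residue No., last residue No., stated length) as given in the text.
REGIONS = [
    ("SEQ ID NO 1, 9-12", 1, 18, 18),
    ("SEQ ID NO 3", 40, 73, 34),
    ("SEQ ID NO 3", 81, 115, 35),
    ("SEQ ID NO 3 epitopic core region", 1, 115, 115),
    ("SEQ ID NO 3 core protein", 1, 191, 191),
    ("SEQ ID NO 4", 40, 57, 18),
    ("SEQ ID NO 4", 240, 251, 12),
    ("SEQ ID NO 4 M-gp35", 31, 222, 192),
    ("SEQ ID NO 4", 134, 152, 19),
    ("SEQ ID NO 4", 223, 239, 17),
    ("SEQ ID NO 4", 92, 109, 18),
    ("SEQ ID NO 4", 31, 106, 76),
    ("SEQ ID NO 4", 134, 169, 36),
]


def inclusive_length(first, last):
    """Number of residues from `first` to `last`, counting both ends."""
    return last - first + 1


def find_mismatches(regions):
    """Return the entries whose stated length disagrees with their range."""
    return [
        region for region in regions
        if inclusive_length(region[1], region[2]) != region[3]
    ]


if __name__ == "__main__":
    for label, first, last, stated in find_mismatches(REGIONS):
        print(f"{label}: No. {first}-{last} is {inclusive_length(first, last)}, stated {stated}")
```

An empty result means every stated length agrees with its range; any entry printed would point to a numbering or transcription error worth checking against the sequence listing.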

[2] Gene Encoding NS1(gp70) Region

(1) Preparation of cDNA clone of SEQ ID NO 13 - 27 and sequencing thereof

The cDNA clones of SEQ ID NO 13 - 27, which encode a novel polypeptide of the NS1(gp70) region of HCV
protein and fragments thereof, were cloned from serum from HC patients as follows.

The cloning and sequencing of cDNA encoding HCV polypeptide can be carried out using any known
method. However, it is hardly accomplished by the known "Okayama-Berg" or "Gubler-Hoffman" method
because the content of HCV in serum is very low and the HCV gene is liable to variation. The
present inventors succeeded in cloning the gene from a small amount of serum, as will be hereinafter
described in Example 1. Briefly, it was conducted by extracting nucleic acids from the serum of a patient
suffering from HC. It is preferable to use serum showing an OD value of 3.5 on a screening kit of Ortho.
Before the extraction, it is desirable to add tRNA or polyribonucleoside to the serum as a carrier for viral
RNA. For the purpose of the invention, tRNA is preferable because the degradation of RNA can be easily
detected, at least after the addition of tRNA, by monitoring on electrophoresis the existence of a sufficient
amount of tRNA having an intact length.

The resultant RNA is converted into cDNA using reverse transcriptase in the presence of an appropriate
oligonucleotide primer. The cDNA is then cloned and amplified by modified PCR (Saiki et al., Nature 324:
126 (1986)) in the presence of a pair of primers. Although commercially available random primers can be
used in the above procedures, synthetic primers having the following base sequences are suitable for the
present invention.

Synthetic Primers for Cloning of HCV Gene 

       5'                     3'
MS122: GAGGCCGTGAACTGCGATGA (SEQ ID NO: 112)
MS148: TTCTCTAAGGTGGCNTCNGCNTG (SEQ ID NO: 113)
MS157: CCGGACGCGTTGAANCTNTGNGT (SEQ ID NO: 114)
MS123: CATCCAGGTACAACCGAACCA (SEQ ID NO: 115)
MS146: AACACACGGCCGCCNCANGGNAA (SEQ ID NO: 116)
MS156: CCGGATCCCACAAGCCGTNGTNGA (SEQ ID NO: 117)



In the above sequences, the letter "N" refers to inosine. The above sequences are only illustrative and 
these base sequences are not critical. They can be modified by replacing nucleotide(s) with other(s), or 
deleting or inserting nucleotide(s). The replacement may be preferably introduced within 10 bases from 5' 
terminus involving 1 to several nucleotides, more preferably, within 5 bases involving less than 5 
nucleotides. The deletion may occur in the 5' terminal region involving 4 to 5 nucleotides, preferably, within 
several bases from the 5' terminus involving a few nucleotides. In case of insertion, it may be an addition of 
8 to 12, preferably 5 to 6, more preferably, a few nucleotides in 5' terminal region. 
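
The placement of inosine in the primers listed above can be tabulated with a short script. The following is a minimal sketch: the function names and the 10-base window check are illustrative assumptions, and only the primer sequences themselves come from the table.

```python
# Primer sequences copied from the table above; "N" denotes inosine.
PRIMERS = {
    "MS122": "GAGGCCGTGAACTGCGATGA",
    "MS148": "TTCTCTAAGGTGGCNTCNGCNTG",
    "MS157": "CCGGACGCGTTGAANCTNTGNGT",
    "MS123": "CATCCAGGTACAACCGAACCA",
    "MS146": "AACACACGGCCGCCNCANGGNAA",
    "MS156": "CCGGATCCCACAAGCCGTNGTNGA",
}


def inosine_positions(primer):
    """Return the 1-based positions of inosine, counted from the 5' terminus."""
    return [i for i, base in enumerate(primer, start=1) if base == "N"]


def five_prime_window_is_fixed(primer, window=10):
    """True when no inosine falls within `window` bases of the 5' terminus."""
    return all(pos > window for pos in inosine_positions(primer))


for name, seq in PRIMERS.items():
    # Print primer name, length, and where the inosine residues sit.
    print(name, len(seq), inosine_positions(seq))
```

In these six primers every inosine lies beyond the tenth base from the 5' terminus. This is only an observation on the listed sequences, not a rule stated in the text.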

PCR can be conducted under appropriate conditions, for example, those described in Example 9, using 
the first complementary DNA (1st cDNA) as a template. The conditions may vary depending on the primers 
used (their base sequences or combination), the length to be amplified, or the like. Examples of primer pairs 
are: MS122 - MS123; MS157 - MS156; and MS148 - MS146. The resultant cDNA is then inserted into an 
appropriate site of a cloning vector, such as the SmaI site of pUC19. The cloning vector harboring the DNA 
fragment is subjected to base sequence determination. Generally, three independently obtained clones 
are employed and the base sequences of both strands are determined to obtain an entire base 
sequence. The sequence is conveniently determined using a fluorescence sequencer, GENESIS 2000 
(DUPONT), according to the protocol attached thereto. Alternatively, conventional subcloning can be used 
when the DNA fragment consists of more than 180 nucleotides or contains a region which is difficult to 
determine with the fluorescence sequencer. 
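
The use of three independently obtained clones can be illustrated by a column-wise comparison of their reads. The text does not state how discrepancies among the three clones are resolved; the majority vote below is an assumption made for illustration, and the reads are toy strings rather than entries from the sequence listing.

```python
from collections import Counter


def consensus(reads):
    """Column-wise majority vote over pre-aligned reads; ties are marked '?'."""
    if len({len(r) for r in reads}) != 1:
        raise ValueError("reads must be pre-aligned to equal length")
    result = []
    for column in zip(*reads):
        # most_common() sorts bases by descending count.
        (top, count), *rest = Counter(column).most_common()
        tied = bool(rest) and rest[0][1] == count
        result.append("?" if tied else top)
    return "".join(result)


# Toy reads (hypothetical): the third clone differs at one position.
reads = ["ACGTAC", "ACGTAC", "ACCTAC"]
print(consensus(reads))
```

A position where all three clones disagree is left as "?" so that it can be re-examined rather than silently resolved.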

Thus obtained base sequences of DNA fragments are shown in SEQ ID NO 13 to 27. 

Clones N19-1, 2, 3, N27-1, 2 and 3 were obtained from serum of a patient N, and clones H19-2, 4, 10, 
Y19-4, 6 and 7 were obtained independently from patients H and Y, respectively. Clones MX24-4, 5 and 13 
were obtained from a pool comprising sera from multiple patients. 

Clones of SEQ ID NO 13 to 15, which were obtained using primers MS157 and MS156, represent the same 
region of the HCV gene, designated as N27. Clones of SEQ ID NO 16 to 24, which were obtained using primers 
MS122 and MS123, likewise represent a single region of the HCV gene, designated as N19, and clones of SEQ ID 
NO 25 to 27, which were obtained using primers MS148 and MS146, represent a single region of the HCV gene, 
designated as MX24. Comparison of the base sequence of each clone with that of the known HCV gene 
(Kato et al., Proc. Natl. Acad. Sci. USA, 87: 9524-9528 (1990); and Takamizawa et al., Journal of Virology, 
65, 3: 1105-1113 (1991)) indicates that the clones align on the HCV gene in the order of N27, N19 and MX24, from 
5' to 3'. As there are overlapping regions between clones, these regions were used to ligate the clones to each 
other, as will be hereinafter described. 

(2) Ligation of Clones of SEQ ID NO 13 to 27 

cDNA clones obtained from serum of HC patients shown by sequences of SEQ ID NO 13 to 27 were 
ligated in the following manners. 

1) Ligation of Clones of SEQ ID NO 13 to 15, and Clones of SEQ ID NO 16 to 24 

Each clone of SEQ ID NO 13 to 15 was cleaved at the MluI site at Nos. 330 to 335 from the 5' terminus of 
the base sequences of SEQ ID NO 13 to 15 and ligated to the MluI site at No. 71 from the 5' terminus of the base 
sequences of SEQ ID NO 16 to 24 by ligation reaction to yield 27 clones having a DNA fragment which 
comprises, at the 5' region, a DNA fragment from clone N27-1, 2 or 3, and at the 3' region, a DNA fragment from 
clone N19-1, 2, 3, clone H19-2, 4, 10, or clone Y19-4, 6 or 7. For example, the ligation reaction between N27-3 and 
N19-1 gave clone N27N19-1 of SEQ ID NO 28. 
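
The joining of two fragments at a shared MluI site, and the count of 27 combinations, can be sketched as follows. The recognition sequence ACGCGT is the standard MluI site and is not spelled out in the text; the function name and the toy fragments are hypothetical.

```python
from itertools import product

MLUI_SITE = "ACGCGT"  # MluI recognition sequence (6 bases); assumed, not given in the text


def ligate_at_site(upstream, downstream, site=MLUI_SITE):
    """Join `upstream` (up to its first site) to `downstream` (from its first site on)."""
    i = upstream.find(site)
    j = downstream.find(site)
    if i < 0 or j < 0:
        raise ValueError("both fragments must contain the site")
    # The site is kept once, contributed here by the downstream fragment.
    return upstream[:i] + downstream[j:]


# Toy fragments (hypothetical), each carrying one MluI site.
up = "GGGTTT" + MLUI_SITE + "AAAA"
down = "CC" + MLUI_SITE + "TTGGCC"
print(ligate_at_site(up, down))

# Three N27 clones combined with nine N19-region clones give 27 products.
n27 = ["N27-1", "N27-2", "N27-3"]
n19 = ["N19-1", "N19-2", "N19-3", "H19-2", "H19-4", "H19-10", "Y19-4", "Y19-6", "Y19-7"]
print(len(list(product(n27, n19))))
```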


2) Ligation of Clones of SEQ ID NO 16 to 24, and Clones of SEQ ID NO 25 to 27 

Each clone of SEQ ID NO 16 to 24 was ligated to each clone of SEQ ID NO 25 to 27 by PCR, which 
yielded 54 kinds of clones. Of these, 27 clones have a DNA fragment encoding a polypeptide 
which contains: from the N-terminus (amino-terminus) to amino acid No. 131, the amino acid sequence comprising 
the 131 amino acid residues from the N- to the C-terminus of SEQ ID NO 16 to 24, and from amino acid No. 132 to the C- 
terminus (carboxy-terminus), the amino acid sequence from No. 16 to the C-terminus of SEQ ID NO 25 to 27. For example, 
a clone of this type obtained by the ligation reaction between N19-1 and MX24-4 is clone N19MX24A-1 of SEQ ID 
NO 29. The others have a DNA fragment encoding a polypeptide which contains: at the N-terminal 
region, the amino acid sequence from the N-terminus to amino acid No. 116 of SEQ ID NO 16 to 24, and from 
amino acid No. 117 to the C-terminus (carboxy-terminus), the amino acid sequence comprising the 209 amino acid 
residues from the N- to the C-terminus of SEQ ID NO 25 to 27. For example, a clone of this type obtained by the ligation reaction 
between N19-1 and MX24-4 is clone N19MX24B-1 of SEQ ID NO 30. 
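
The residue numbering of the two fusion types can be checked with simple string slicing. This is a sketch under stated assumptions: the helper `fuse` is hypothetical, the placeholder polypeptides are not real sequences, and the downstream length of 300 is arbitrary.

```python
def fuse(n_part, c_part, n_last, c_first):
    """Residues 1..n_last of n_part followed by residues c_first..end of c_part (1-based)."""
    return n_part[:n_last] + c_part[c_first - 1:]


# Placeholder polypeptides; letters only mark which parent a residue comes from.
n19 = "n" * 131   # SEQ ID NO 16 to 24 are described as 131 residues long
mx24 = "M" * 300  # arbitrary illustrative length, not taken from the text

# Type A: all 131 upstream residues, downstream part entered at residue 16.
type_a = fuse(n19, mx24, n_last=131, c_first=16)
# Type B: upstream part left after residue 116, whole downstream part.
type_b = fuse(n19, mx24, n_last=116, c_first=1)

print(len(type_a), len(type_b))
```

Both junctions skip the same number of residues (131 - 116 = 16 - 1 = 15), so the two fusion types have equal length and differ only in which parent supplies the 15-residue overlap.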

3) Ligation of Clones of SEQ ID NO 13 to 27 

Each clone of SEQ ID NO 13 to 15 was cleaved at the MluI site at base Nos. 330 to 335 from the 5' terminus 
of the base sequences of SEQ ID NO 13 to 15 and ligated to the MluI site at base Nos. 71 to 76 from the 5' terminus of the 
base sequences of the above (2), 2) by ligation reaction to yield clones of the N27MX24 series. For example, a clone 
obtained by the ligation reaction between N27-3 and N19MX24A-1 is clone N27MX24A-1 of SEQ ID NO 
31, and a clone obtained by the ligation reaction between N27-3 and N19MX24B-1 is clone N27MX24B-1 
of SEQ ID NO 32. 

On the basis of the homology between the amino acid sequence of the clone and that reported 
previously (Kato et al., Proc. Natl. Acad. Sci. USA, 88: 5547-5551 (1991); and Hijikata et al., in: Congress of 
Association of Japan Molecular Biology, November 29, 1990), the clone N27N19MX24A-1 proved to be the 
entire region of a gene encoding the gp70 polypeptide reported by Kato et al. Thus, the polypeptide comprising 
amino acids from Nos. 46 to 395 of SEQ ID NO 31 and 32 corresponds to the gp70 protein presented by Kato 
et al. 

On the other hand, the polypeptide comprising amino acids from Nos. 1 to 45 and the polypeptide comprising 
amino acids from No. 46 to the C-terminus of SEQ ID NO 13 to 15 correspond to the C-terminal region of the 
gp35 polypeptide and the N-terminal region of the gp70 polypeptide reported by Hijikata et al., respectively. 
Further, the amino acid sequence from No. 46 to the C-terminus of SEQ ID NO 13 to 15 corresponds to a 
sequence from the N-terminus to amino acid No. 67 of gp70 reported by Kato et al., and the amino acid 
sequence from the N- to the C-terminus of SEQ ID NO 16 to 24 corresponds to a sequence from amino acid Nos. 42 
to 172 of gp70 and represents a fragment of the gp70 protein presented by Kato et al. 

The amino acid No. 1 of SEQ ID NO 25 to 27 corresponds to the amino acid No. 158 from the N-terminus of 
a sequence reported by Hijikata et al., and the amino acid No. 350 of SEQ ID NO 25 to 27 corresponds 
to the C-terminal amino acid of gp70 reported by Shimotohno et al. A polypeptide comprising amino 
acids from No. 194 to the C-terminus of SEQ ID NO 25 to 27 corresponds to the N-terminal region of a non- 
structural protein of HCV (NS2). 

The ligation products prepared in 2) code all or a part of gp70 polypeptide reported by Hijikata et al. 
For example, the polypeptide from amino acid Nos. 46 to 395 of SEQ ID NO 31 or 32 corresponds to gp70 
protein of Hijikata et al. Although a protein expressed from a HCV gene encoding a polypeptide from amino 
acid Nos. 46 to 395 of SEQ ID NO 31 or 32 is gp70 protein, said expression product is herein referred to as 
M-gp70, in contrast with gp70 reported by Hijikata et al in: Congress of Association of Japan Molecular 
Biology, November 29, 1990. 

(3) Expression of Polypeptides Encoded by Clones 

DNA fragments obtained in the above 1) and 2) can be used to produce a recombinant HCAg by 
constructing an expression vector containing DNA encoding a clone, by inserting the DNA into a known 
expression vector at an appropriate site of the vector, downstream from a promoter, using a well known 
method per se, and introducing the expression vector harboring the DNA into a host cell such as 
Escherichia coli cell, yeast cell or the like according to the method known to one of skill, culturing the 
transformant in a medium under an appropriate condition, and recovering a product from the cultured broth. 

The present invention can be accomplished using any expression vectors which have a promoter at an 
appropriate site to direct the expression of a DNA encoding HCV polypeptide or a fragment thereof. 
Expression vectors preferably contain promoter, ribosome binding (SD) sequence, gene encoding HCV 
polypeptide, transcription termination factor, and a regulator gene. 

Expression vectors functional in microorganisms such as Escherichia coli, Bacillus subtilis or the like 
will preferably comprise promoter, ribosome binding (SD) sequence, HCV-associated-protein-encoding 
gene, transcription termination factor, and a regulator gene. 

Examples of promoters include those derived from Escherichia coli or phages, such as the tryptophan 
synthetase (trp) promoter, lactose operon (lac) promoter, λ phage PL and PR promoters, T5 early gene P25 and P26 promoters and the like. 
These promoters may have a sequence modified or designed for each expression vector, such as the pac 
promoter. 

Although the SD sequence may be derived from Escherichia coli or phage, a sequence which has been 
designed to contain a consensus sequence consisting of more than 4 bases, which is complementary to the 
sequence at the 3' terminal region of 16S ribosomal RNA, may also be used. 
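
The complementarity requirement for a designed SD sequence can be illustrated by searching a leader for the longest stretch that pairs with the 16S rRNA tail. In this sketch the rRNA tail sequence, the function names and the example leader are assumptions introduced for illustration; only perfect Watson-Crick pairing is considered.

```python
# 3' terminal bases of E. coli 16S rRNA, written 5'->3'.
# Assumed here for illustration; the text does not give this sequence.
RRNA_3_END = "ACCUCCUUA"
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(PAIR[b] for b in reversed(seq))


def longest_sd_match(mrna_leader, rrna_tail=RRNA_3_END):
    """Longest stretch of the leader that is perfectly complementary to the rRNA tail."""
    target = reverse_complement_rna(rrna_tail)  # what the mRNA must read to pair
    best = ""
    for i in range(len(target)):
        for j in range(i + 1, len(target) + 1):
            piece = target[i:j]
            if len(piece) > len(best) and piece in mrna_leader:
                best = piece
    return best


# Hypothetical leader sequence upstream of an AUG initiation codon.
leader = "AUUCAGGAGGAAUUAUG"
match = longest_sd_match(leader)
print(match, len(match) > 4)
```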

The transcription termination factor is not essential. However, it is preferable that an expression vector 
contains a ρ-independent factor such as the lipoprotein terminator, trp operon terminator or the like. 

Preferably, these sequences required for the expression of a gene encoding HCAg originated from HCV 
are located, in an appropriate expression plasmid, in the order of promoter, SD sequence, said gene and 
transcription termination factor in the 5' to 3' direction. 

Typical example of expression vectors is commercially available pKK233-2 (Pharmacia). However, a 
series of plasmids pGEX (Pharmacia), which are provided for the expression of fused protein, are also 
employable for the expression of HCAg-encoding gene of the present invention. 

A suitable host cell such as Escherichia coli can be transformed with an expression vector comprising a 
DNA of the invention by any known method, such as the protocol provided by TOYOBO Japan as described 
in Example 10. 

The cultivation of the transformants can be carried out using any of the procedures well known in the literature, 
such as Molecular Cloning, 1982, and the like. The cultivation is preferably conducted at a temperature from 
about 28°C to 42°C. 

Expression vectors used for transforming other host cells, such as those derived from insects, consist of 
substantially the same elements as those described in the above. However, there are certain preferable 
factors as follows. 

When insect cells are used, a commercially available kit, MAXBAC™ is employed according to the 
teaching of the supplier (MAXBAC ™ BACULOVIRUS EXPRESSION SYSTEM MANUAL VERSION 1.4). In 
this case, it is desirable to make a modification to reduce the distance between the promoter of polyhedrin 
gene and the initiation codon so as to improve the expression of the gene. 

A clone of the invention can be inserted into an expression vector for procaryotic cells such as E. coli or 
eucaryotic cells such as animal cells after modifying the DNA sequence to bring it into conformity with the 
frame of the initiation codon of said vector. Alternatively, an initiation codon is added at the 5' terminus of the DNA 
so that an appropriate translational frame is produced. The term "a translational frame" of a clone 
refers to a frame of a base sequence in which bases are described as triplets capable of encoding an amino 
acid sequence as illustrated in SEQ ID NO 13 to 32. 
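
The notion of a translational frame, and the effect of adding an initiation codon at the 5' terminus, can be shown with a small translation routine. The sketch below uses the standard genetic code; the function names and the toy fragment are hypothetical and not taken from the sequence listing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, ordered with T, C, A, G at each codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Read `dna` as triplets starting at offset `frame`; '*' marks a stop codon."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    # Codons containing anything other than A, C, G, T are reported as 'X'.
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def add_initiation_codon(dna):
    """Prefix an ATG so that translation starts in the frame of `dna`."""
    return "ATG" + dna


fragment = "GCCGTGAACTGC"  # arbitrary toy fragment
construct = add_initiation_codon(fragment)
print(translate(construct, frame=0))  # intended frame
print(translate(construct, frame=1))  # shifted frame gives a different reading
```

Shifting the frame by a single base changes every codon, which is why the insert must be brought into conformity with the frame of the vector's initiation codon.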

The recombinant polypeptide expressed by host cells such as microorganisms including E. coli and 
insect cells and animal cells can be recovered from the cultured broth by known methods and identified by, 
for example, immunoreactions between the expressed product and antiserum obtained from HC patients 
using a conventional method such as Western blot analysis. 

E. coli cells transformed with any of clones obtained in the above (1) and (2) express polypeptides 
encoded thereby as a single polypeptide without cleaving between regions gp35, gp70 and NS2. 

When any of the clones obtained in the above (1) and (2) is expressed in insect cells, the expressed 
polypeptide is cleaved between regions gp35, gp70 and NS2. Thus, when clone N27MX24A-1 or N27MX24B-1 
was introduced into insect or animal cells, polypeptide M-gp70 derived from clone N27MX24A-1 or 
N27MX24B-1 was expressed as a glycoprotein after processing. 

The following polypeptides comprising amino acid sequences of SEQ ID NO 31 or 32 are relatively 
highly hydrophilic and homologous to amino acid sequences deduced from known HCV genes cloned before 
the present invention: a polypeptide consisting of 13 amino acids from amino acid Nos. 143 to 155; a 
polypeptide consisting of 21 amino acids from amino acid Nos. 171 to 191, provided that it contains at 
least amino acids from Nos. 182 to 187; a polypeptide consisting of 14 amino acids from amino acid Nos. 
202 to 215, provided that it contains at least amino acids from Nos. 202 to 209; a polypeptide consisting of 
13 amino acids from amino acid Nos. 244 to 256; and a polypeptide consisting of 21 amino acids from 
amino acid Nos. 299 to 319. 
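
The stated fragment lengths can be checked against their residue ranges, which are inclusive and 1-based. This is a minimal sketch; the helper names are hypothetical and the ranges are copied from the paragraph above.

```python
# (first, last, stated length) for the five fragments listed above,
# numbered as in SEQ ID NO 31 or 32.
FRAGMENTS = [
    (143, 155, 13),
    (171, 191, 21),
    (202, 215, 14),
    (244, 256, 13),
    (299, 319, 21),
]


def span_length(first, last):
    """Number of residues in an inclusive, 1-based range."""
    return last - first + 1


def extract(polypeptide, first, last):
    """Slice an inclusive, 1-based residue range out of a polypeptide string."""
    return polypeptide[first - 1:last]


# Collect any fragment whose stated length disagrees with its range.
mismatches = [f for f in FRAGMENTS if span_length(f[0], f[1]) != f[2]]
print(mismatches)
```

The same `extract` helper could be applied to a full-length sequence string to pull out each fragment for synthesis or comparison.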

The M-gp70 is a glycoprotein which is located adjacent to the C-terminus of the envelope protein (M-gp35) on 
the HCV gene and contains a potential trans-membrane region. These facts lead to an assumption that all or a 
part of gp70, whose function has not been established yet, may be a part of the envelope protein. On the basis 
of this assumption, the above five kinds of polypeptide fragments, as well as a polypeptide consisting of 
106 amino acids from Nos. 109 to 214 and that consisting of 92 amino acids of the amino acid sequence from 
Nos. 233 to 324 of SEQ ID NO 31 or 32, which include said fragments, are useful as vaccines. 

Furthermore, the following polypeptides which comprise amino acid sequences of SEQ ID NO 31 or 32 
and are expected to be epitopic regions of M-gp70 are also useful as vaccines: a polypeptide consisting of 10 
amino acids from amino acid Nos. 252 to 261, provided that it contains at least amino acids from Nos. 252 
to 256; a polypeptide consisting of 34 or fewer amino acids from amino acid Nos. 250 to 283, provided 
that it contains at least amino acids from Nos. 273 to 279; a polypeptide consisting of 20 amino acids 
from amino acid Nos. 77 to 96; a polypeptide consisting of 18 amino acids from amino acid Nos. 306 to 
323; and a polypeptide consisting of 16 amino acids from amino acid Nos. 122 to 137. 

The following polypeptides comprising amino acid sequences of SEQ ID NO 31 or 32 are relatively 
highly hydrophilic and low in homology with amino acid sequences deduced from known HCV genes cloned 
before the present invention: a polypeptide consisting of 12 amino acids from amino acid Nos. 136 to 147, 
provided that it contains at least amino acids from Nos. 136 to 142; a polypeptide consisting of 27 amino 
acids from amino acid Nos. 45 to 71, provided that it contains at least amino acids from Nos. 53 to 69; and a 
polypeptide consisting of 9 amino acids from amino acid Nos. 193 to 201. 
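
A hydrophilicity comparison of candidate fragments can be sketched with a hydropathy scale. The text does not say which scale or window was used; the Kyte-Doolittle scale and the 7-residue window below are stand-ins chosen for illustration, and the peptides are toy strings rather than fragments of SEQ ID NO 31 or 32.

```python
# Kyte-Doolittle hydropathy values (assumed scale; negative means hydrophilic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide):
    """Average hydropathy of a peptide; lower values indicate higher hydrophilicity."""
    return sum(KD[a] for a in peptide) / len(peptide)


def sliding_hydropathy(seq, window=7):
    """Mean hydropathy of every full window along the sequence."""
    return [mean_hydropathy(seq[i:i + window]) for i in range(len(seq) - window + 1)]


# Toy peptides (hypothetical): one charged, one aliphatic.
print(round(mean_hydropathy("KDEERK"), 2))
print(round(mean_hydropathy("LIVVAF"), 2))
print(len(sliding_hydropathy("KDEERKLIVVAF")))
```

Windows with strongly negative averages would be the hydrophilic candidates in such a scan; any cutoff would have to be chosen by the user.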

These polypeptides can be produced by chemical synthesis, as well as by recombinant DNA techniques. 
Furthermore, a polypeptide having the 106 amino acid sequence from Nos. 109 to 214 and a polypeptide 
having the 92 amino acid sequence from Nos. 233 to 324 of SEQ ID NO 31 or 32 can be produced on a large 
scale by recombinant DNA techniques. 

[3] Genes Encoding NS2 - NS4 Regions 

(1) Preparation of cDNA clones of SEQ ID NO 33 - 44 and sequencing thereof 

The cDNA clones of SEQ ID NO 33 - 44 which encode a novel polypeptide of NS2 - NS4 regions of 
HCV protein and fragments thereof were cloned from serum from HC patients as follows. 

The cloning and sequencing of cDNA encoding HCV polypeptide can be carried out using any of known 
methods. However, it is hardly accomplished by the known "Okayama-Berg" or "Gubler-Hoffman" method 
because serum contains only a slight amount of HCV and the HCV gene is liable to variation. The 
present inventors succeeded in cloning the gene from a slight amount of serum as will be hereinafter 
described in Example 1. Briefly, it was conducted by extracting nucleic acids from the serum of a patient 
suffering from HC. It is preferable to use serum showing an OD value of 3.5 on a screening kit of Ortho. Before 
the extraction, it is desirable to add tRNA or polyribonucleoside to the serum as a carrier for viral RNA. For 
the purpose of the invention, tRNA is preferable because degradation of RNA can be easily detected, at 
least after the addition of tRNA, by checking on electrophoresis that a sufficient amount of tRNA of 
intact length remains. 

The resultant RNA is converted into cDNA using transcriptase in the presence of an appropriate 
oligonucleotide primer. The cDNA is then cloned and amplified by modified PCR (Saiki et al., Nature 324: 
126 (1986)) in the presence of a pair of primers. Although commercially available random primers can be 
used in the above PCR, synthetic primers having the following base sequences are suitable for the present 
invention. 

Synthetic Primers for Cloning of HCV Gene 

       5'                       3' 
MS49:  GACATGCATGTCATGATGTA (SEQ ID NO: 118) 
MS88:  GGCTGCAGCCGGTTCATCCACTGCAC (SEQ ID NO: 119) 
MS100: GCGGATCCTGCTTCGCCCAGAAGGTC (SEQ ID NO: 120) 
MS132: GACACATGTGTTGCAGTCGATC (SEQ ID NO: 121) 
MS152: CGGTCCNAGNAGTATCTCNTTNCC (SEQ ID NO: 122) 
MS158: ATGGGCCCGGGNGANAGNAGNCTCCCCCTNCTNTC (SEQ ID NO: 123) 
MS48:  GGCTATACCGGCGACTTCGA (SEQ ID NO: 124) 
MS86:  GCGGATCCGGCCTCACCCACATAGATG (SEQ ID NO: 125) 
MS97:  GCGGATCCTCCACCTCCATCGTG (SEQ ID NO: 126) 
MS135: CTGCTGTCGCCCNGNCCCAT (SEQ ID NO: 127) 
MS151: ATCACGTGGGGNGCAGANACNGC (SEQ ID NO: 128) 
MS155: TGTGCCTGNTTNTGGATGATG (SEQ ID NO: 129) 



In the above sequences, the letter "N" refers to inosine. The above sequences are only illustrative and 
these base sequences are not critical. They can be modified by replacing nucleotide(s) with other(s), or 
deleting or inserting nucleotide(s). The replacement may be preferably introduced within 10 bases from the 5' 
terminus involving 1 to several nucleotides, more preferably, within 5 bases involving less than 5 
nucleotides. The deletion may occur in the 5' terminal region involving 4 to 5 nucleotides, preferably, within 
several bases from the 5' terminus involving a few nucleotides. In case of insertion, it may be an addition of 
8 to 12, preferably 5 to 6, more preferably, a few nucleotides in the 5' terminal region. Primers MS88, MS86, MS97 
and MS100 contain 8 additional nucleotides encoding a restriction site at the 5' terminus (MS88: 5' 
GGCTGCAG 3'; MS86, MS97 and MS100: 5' GCGGATCC 3'); however, these are not critical for the 
isolation of the desired DNA fragments. 
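
The 8-nucleotide 5' additions can be separated from the HCV-specific part of each primer as sketched below. The helper name is hypothetical. The hexamers CTGCAG and GGATCC inside the tags match the recognition sequences commonly listed for PstI and BamHI, respectively; the text itself does not name the enzymes.

```python
# 5' tags and primer sequences as given in the text.
TAGS = {
    "MS88": "GGCTGCAG",
    "MS86": "GCGGATCC",
    "MS97": "GCGGATCC",
    "MS100": "GCGGATCC",
}
PRIMERS = {
    "MS88": "GGCTGCAGCCGGTTCATCCACTGCAC",
    "MS86": "GCGGATCCGGCCTCACCCACATAGATG",
    "MS97": "GCGGATCCTCCACCTCCATCGTG",
    "MS100": "GCGGATCCTGCTTCGCCCAGAAGGTC",
}


def split_tag(name):
    """Return (tag, remainder) for a tagged primer, checking the tag is at the 5' end."""
    primer, tag = PRIMERS[name], TAGS[name]
    if not primer.startswith(tag):
        raise ValueError(f"{name} does not start with its stated tag")
    return tag, primer[len(tag):]


for name in PRIMERS:
    tag, core = split_tag(name)
    print(name, tag, core, len(core))
```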

PCR can be conducted under appropriate conditions, for example, those described in Example 15, using 
the first complementary DNA (1st cDNA) as a template. The conditions may vary depending on the primers 
used (their base sequences or combination), the length to be amplified, or the like. Examples of primer pairs 
are: MS48 - MS49; MS86 - MS100; MS97 - MS88; MS135 - MS132; MS155 - MS152; and MS151 - 
MS158. The resultant cDNA is then inserted into an appropriate site of a cloning vector, such as the SmaI site 
of pUC19. The cloning vector harboring the DNA fragment is subjected to base sequence 
determination. Generally, three independently obtained clones are employed and the base sequences of both 
strands are determined to obtain an entire base sequence. The sequence is conveniently determined using 
a fluorescence sequencer, GENESIS 2000 (DUPONT), according to the protocol attached thereto. Alternatively, 
conventional subcloning can be used when the DNA fragment consists of more than 180 
nucleotides or contains a region which is difficult to determine with the fluorescence sequencer. Thus obtained 
base sequences of DNA fragments are shown in SEQ ID NO 33 - 39, 44 - 55, and 103 and 104. 
Clones N13-1, N15-1, N16 and N23 were obtained from serum of a patient N, clone O26 from patient O, 
clone U16-4 from patient U, and clone MX25 from a pool comprising sera from multiple patients. The clone of 
SEQ ID NO 37, obtained using primers MS48 and MS49, and the clones of SEQ ID NO 53 to 55 represent the 
same region of the HCV gene (N16 region). The region of the HCV gene covered by the clones of SEQ ID NO 44 to 46, 
obtained using primers MS155 and MS152, was designated as the MX25 region. In the same manner, the regions of 
the clones of SEQ ID NO 47 to 49 and of the clones of SEQ ID NO 50 to 52, obtained using primers 
MS151 and MS158, and MS135 and MS132, were designated as the O26 and N23 regions, respectively. 

Clones N13-1, N15-1, O15-1 and O15-2 of SEQ ID NO 38, 39, 103 and 104, which were obtained using 
primers MS86 and MS100, or MS97 and MS88, were designated as regions N13 and N15. 

The comparison between the base sequence of each clone and the known HCV gene (Kato et al., Proc. Natl. 
Acad. Sci. USA, 87: 9524-9528 (1990); and Takamizawa et al., Journal of Virology, 65, 3: 1105-1113 (1991)) 
indicates that the clones align in the order of MX25, O26, N23, N16, N13 and N15, from 5' to 3' on the gene. 

The clone N16 of SEQ ID NO 36 was obtained by isolating independently three plasmids containing 
DNA fragment of N16 region, and determining the entire base sequence of DNA fragment originated from 
HCV. 

As there are overlapping regions between clones, these regions were used to ligate the clones to each other. 

Although the clones are highly homologous, they are distinguishable from each other in terms of nucleotide 
and amino acid sequences (e.g., clones of SEQ ID NO 33, 34 and 35), which indicates that one patient may 
carry more than one HCV at the same time. It is generally accepted that the core protein is well conserved 
even in HCV. When the core-protein-encoding gene was cloned in the same manner as that used for the 
cloning of the gene encoding HCV polypeptide, few variations were observed between clones. Among regions 
of the HCV gene, the MX25, O26, N23 and N16 regions, especially the MX25 region, appear to be highly liable to variation 
compared with the core-protein-encoding region and the region upstream thereof. 
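
The degree to which clones of the same region differ can be quantified by percent identity and by listing the variable positions. The following is a minimal sketch on pre-aligned toy sequences; the function names and sequences are hypothetical, and no figures from the actual clones are implied.

```python
def percent_identity(a, b):
    """Percentage of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def variable_positions(clones):
    """1-based positions at which the aligned clones do not all agree."""
    return [i for i, col in enumerate(zip(*clones), start=1) if len(set(col)) > 1]


# Toy aligned clones (hypothetical): the third differs at two positions.
clones = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTTC"]
print(percent_identity(clones[0], clones[2]))
print(variable_positions(clones))
```

Comparing the density of variable positions between regions would be one way to express that a region such as MX25 is more liable to variation than the core-protein-encoding region.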

(2) Ligation of Clones of SEQ ID NO 33 to 39 


cDNA clones obtained from serum of HC patients shown by sequences of SEQ ID NO 33 to 37 and 39 
were ligated in the following manners. 

1) Ligation of Clone N16 of SEQ ID NO 36 and clone N15-1 of SEQ ID NO 39 

The ligation was conducted at restriction sites common to both clones. Thus, clone N16 was digested 
with restriction enzyme to cleave at the BstEII site located at nucleotide Nos. 576-582 of SEQ ID NO 36 and 
ligated to the BstEII site of clone N15-1 at Nos. 114 to 120 of SEQ ID NO 39 to obtain a DNA fragment 
consisting of DNA fragments from clones N16 and N15-1, in that order from 5' to 3'. The resultant clones are 
summarized as the clone of SEQ ID NO 41. 

2) Ligation of Clones MX25 (SEQ ID NO 33) and O26 (SEQ ID NO 34) 

Clones MX25 and O26 were ligated by PCR. By this procedure, multiple DNA fragments encoding 
different polypeptides were obtained, for example, a DNA fragment encoding a polypeptide which comprises, 
at the N-terminal region, the 284 amino acids from the N- to the C-terminus of SEQ ID NO 33 and, from amino acid 
No. 285 to the C-terminus, amino acids from No. 32 to the C-terminus of SEQ ID NO 34; and a DNA fragment 
encoding a polypeptide which comprises, at the N-terminal region, amino acid residues from the N-terminus to 
amino acid No. 252 of SEQ ID NO 33 and, from amino acid No. 253 to the C-terminus, the 174 amino acid 
residues from the N- to the C-terminus of SEQ ID NO 34. The fused clones thus obtained are shown inclusively in 
SEQ ID NO 40. 

Clones of SEQ ID NO 36 and 39 or clones of SEQ ID NO 37 and 39 can be ligated by PCR and the 
resultant clone is shown in SEQ ID NO 41 together with the base sequence obtained in the above 1). Clone 
MX25 of SEQ ID NO 33 and clone O26 of SEQ ID NO 34, both of which contain different DNA fragments 
from those used in the above, were ligated to give multiple DNA fragments having different base 
sequences. These base sequences are summarized in SEQ ID NO 40. 

3) Ligation of Clones of SEQ ID NO 35 and 41 

Ligation of clones N23 and N16N15 can be conducted in the same manner as the above 1) to obtain 
various clones, which are designated inclusively as N23N15 of SEQ ID NO 42. The following illustrative DNA 
fragments were obtained: a DNA fragment encoding a polypeptide comprising, at the N-terminal region, the 307 
amino acid residues from the N- to the C-terminus of SEQ ID NO 35 (clone N23) and, from amino acid No. 308 to the 
C-terminus, amino acids from No. 17 to the C-terminus of SEQ ID NO 41; and a DNA fragment encoding a 
polypeptide comprising, at the N-terminal region, amino acids from the N-terminus to amino acid No. 291 of 
SEQ ID NO 33 and, from amino acid No. 292 to the C-terminus, the 477 amino acid residues from the N- to the C- 
terminus of SEQ ID NO 41. 

4) Ligation of Clones of SEQ ID NO 40 and 42 

Ligation of clones MX25O26 and N23N15 can be conducted in the same manner as the above 1) to 
obtain clones shown inclusively in SEQ ID NO 43. 

The protease activity of the viral protein of Flavivirus, a virus related to HCV, exists in the N-terminal 
domain of a non-structural protein of said virus (see, Proc. Natl. Acad. Sci. USA, 87: 8898-8902 (1990)). It is 
likely that the protease activity of HCV protein also exists in the presumed N-terminal region, NS3. It was 
confirmed that clone MX25N15 comprises the known entire amino acid sequence encoded by the HCV gene 
(Kato et al., Proc. Natl. Acad. Sci. USA, 87: 9524-9528 (1990)), and a region responsible for the protease 
activity reported by Hijikata et al. (in: Congress of Japan Cancer Association (NIHON Gan-Gakkai) (1991)). 

Although neither the N- nor the C-terminus of the NS3 domain of HCV protein had been established, the domain can 
be presumed to be a region between amino acid Nos. 276 and 884 of SEQ ID NO 43 (clone MX25N15) on 
the basis of the primary structure of regions to be cleaved by protease and the hydrophilic and hydrophobic 
patterns of Flavivirus protein, referring to the literature (Houghton et al., Hepatology, 14, 2: 381-388 (1991)). 
The presumed NS3 region of clone MX25N15 is hereinafter referred to as MK/NS3 region. 

In the same manner, the NS2 region was presumed to be a polypeptide region between amino acid 
Nos. 3 and 275 of SEQ ID NO 43 (clone MX25N15) and 40 (clone MX25O26). The presumed NS2 region is 
hereinafter referred to as MK/NS2 region. 

(3) Expression of Polypeptides Encoded by Clones 

DNA fragments obtained in the above 1) and 2) can be used to produce a recombinant HCAg by 
constructing an expression vector containing DNA encoding a clone, by inserting the DNA into a known 
expression vector at an appropriate site of the vector, downstream from a promoter, using a well known 
method per se, and introducing the expression vector harboring the DNA into a host cell such as 
Escherichia coli cell, yeast cell, animal cell or the like according to the method known to one of skill, 
culturing the transformant in a medium under an appropriate condition, and recovering a product from the 
cultured broth. 

The present invention can be accomplished using any expression vectors which have a promoter at an 
appropriate site to direct the expression of a DNA encoding HCV polypeptide or a fragment thereof. 
Expression vectors preferably contain promoter, ribosome binding (SD) sequence, gene encoding HCV 
polypeptide, transcription termination factor, and a regulator gene. 

Expression vectors functional in microorganisms such as Escherichia coli, Bacillus subtilis or the like 
will preferably comprise promoter, ribosome binding (SD) sequence, HCV-associated-protein-encoding 
gene, transcription termination factor, and a regulator gene. 

Examples of promoters include those derived from Escherichia coli or phages, such as the tryptophan 
synthetase (trp) promoter, lactose operon (lac) promoter, λ phage PL and PR promoters, T5 early gene P25 and P26 promoters and the like. 
These promoters may have a sequence modified or designed for each expression vector, such as the pac 
promoter. 

Although the SD sequence may be derived from Escherichia coli or phage, a sequence which has been 
designed to contain a consensus sequence consisting of more than 4 bases, which is complementary to the 
sequence at the 3' terminal region of 16S ribosomal RNA, may also be used. 

The transcription termination factor is not essential. However, it is preferable that an expression vector 
contains a ρ-independent factor such as the lipoprotein terminator, trp operon terminator or the like. 

Preferably, these sequences required for the expression of a gene encoding HCAg originated from HCV 
are located, in an appropriate expression plasmid, in the order of promoter, SD sequence, HCV-associated-protein-encoding 
gene and transcription termination factor in the 5' to 3' direction. 

Typical example of expression vectors is commercially available pKK233-2 (Pharmacia). However, a 
series of plasmids pGEX (Pharmacia), which are provided for the expression of fused protein, are also 
employable for the expression of HCAg-encoding gene of the present invention. 

A suitable host cell such as Escherichia coli can be transformed with an expression vector comprising a 
DNA of the invention by any of known methods such as protocol provided by TOYOBO Japan as described 
in Example 16. 

The cultivation of the transformants can be carried out using any of the procedures well known in the literature, 
such as Molecular Cloning, 1982, and the like. The cultivation is preferably conducted at a temperature from 
about 28°C to 42°C. 

Expression vectors used for transforming other host cells, such as those derived from insects or 
animals including mammals, consist of substantially the same elements as those described in the above. 
However, there are certain preferable factors as follows. 

When insect cells are used, a commercially available kit, MAXBAC™ is employed according to the 
teaching of the supplier (MAXBAC™ BACULOVIRUS EXPRESSION SYSTEM MANUAL VERSION 1.4). In 
this case, it is desirable to make a modification to reduce the distance between the promoter of polyhedrin 
gene and the initiation codon so as to improve the expression of the gene. 

When animal cells are used as hosts, expression vectors preferably contain an active-type promoter from the 
adenovirus E1A gene (ZOKUSEIKAGAKU JIKKEN KOZA I, IDENSHI KENKYU-HO II, 189-190, 1986), SV40 
early promoter, SV40 late promoter, apolipoprotein E gene promoter, SRα promoter (Molecular and Cellular 
Biology, 8, 1, 466-472, 1988) or the like. Specifically, known expression vectors such as pKCR (Proc. Natl. 
Acad. Sci. USA, 78: 1528 (1981)) or a derivative thereof prepared by modifying pKCR while maintaining its 
essential functions, pBPV MT1 (Proc. Natl. Acad. Sci. USA, 80: 398 (1983)), or the like may be employed. 

Animal cells usable in the present invention are CHO cell, COS cell, mouse L cell, mouse C127 cell, 
mouse FM3A cell and the like. 

A clone of the invention can be inserted into an expression vector for procaryotic cells such as Ecoli or 
eucaryotic cells such as animal cells after modifying the DNA sequence to bring it in conformity with a 
frame of initiation codon of said vector. Alternatively, an initiation codon is added at the 5' terminus of DNA 
so as to an appropriate translational frame can be produced. The term "a translational frame" of a clone 
refers to a frame of a base sequence in which bases are described as triplets capable of encoding amino 
acid sequence as illustrated in SEQ ID NO 33 to 43. 
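
The notion of a translational frame described above can be illustrated with a minimal Python sketch. The function names and the short test sequence are hypothetical and are not taken from the Sequence Listing; the sketch only shows how the same bases read as different triplets depending on the frame, and how adding an initiation codon at the 5' terminus fixes the frame.

```python
# Standard genetic code, built compactly from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Read dna as triplets starting at offset frame (0, 1 or 2).

    Codons containing anything other than A, C, G, T are shown as 'X';
    stop codons are shown as '*'.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def add_initiation_codon(dna):
    """Prepend ATG so that translation starts in frame with the insert."""
    return "ATG" + dna


if __name__ == "__main__":
    example = "ATGGCTTGG"  # hypothetical sequence, for illustration only
    for frame in range(3):
        print(frame, translate(example, frame))
```

Each frame gives a different amino acid sequence from the same bases, which is why the insert must be brought into conformity with the frame of the initiation codon of the vector.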

The recombinant polypeptide expressed by the host cells such as microorganisms including E. coli, 
insect cells and animal cells can be recovered from the cultured broth by known methods and identified by, 
for example, immunoreactions between the expressed product and antiserum obtained from HC patients 
using a conventional method such as Western blot analysis. 

From a hydrophilicity study and prediction of higher-order structure of the protein, the following peptide fragments 
contained in a polypeptide having amino acid sequence of SEQ ID NO 43 appeared to be highly hydrophilic 
and can take a so-called "turn structure", not α-helix or β-sheet structure, with high probability. Therefore, these 
fragments possibly represent antigen determinants, or can contain at least one antigen determinant of 
HCAg. Although the higher-order structure in serum and the specific reactivity of each fragment are not 
established, it can be concluded that the following peptide fragments are highly reactive with antiserum 
raised against HCV-associated antigens. A polypeptide consisting of 19 amino acids from amino acid Nos. 
247 to 265 of SEQ ID NO 43; a polypeptide consisting of 8 to 25 amino acids subject to that it contains at 
least 8 amino acids from Nos. 300 to 307; a polypeptide consisting of 13 to 25 amino acids subject to that it 
contains at least 13 amino acids from Nos. 410 to 428; a polypeptide consisting of 10 amino acids from 
Nos. 283 to 292; a polypeptide consisting of 14 amino acids from Nos. 477 to 490; a polypeptide consisting 
of 14 amino acids from Nos. 498 to 512; a polypeptide consisting of 12 amino acids from Nos. 538 to 550; 
a polypeptide consisting of at least 21 amino acids from Nos. 747 to 767; a polypeptide consisting of at 
least 12 amino acids from Nos. 841 to 852; a polypeptide consisting of at least 12 amino acids from Nos. 
867 to 878; a polypeptide consisting of 8 to 25 amino acids subject to that it contains at least 8 amino acids 
from Nos. 665 to 672; and a polypeptide consisting of 15 amino acids from Nos. 315 to 327. 
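
The text does not state which hydrophilicity scale or window size was used. As a rough sketch only, the following Python code assumes the widely used Kyte-Doolittle hydropathy scale and a sliding window of 7 residues (both are assumptions, as are the function names and the threshold) to show how hydrophilic stretches of a polypeptide could be located.

```python
# Kyte-Doolittle hydropathy values (negative = hydrophilic).
# The choice of this scale is an assumption; the text names no scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Mean hydropathy of every window of the given length."""
    scores = [KYTE_DOOLITTLE[residue] for residue in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(protein, window=7, threshold=-1.5):
    """Return 1-based inclusive (start, end) ranges whose mean is below threshold."""
    return [
        (i + 1, i + window)
        for i, mean in enumerate(hydropathy_profile(protein, window))
        if mean < threshold
    ]


if __name__ == "__main__":
    toy = "KKKKKKKLLLLLLL"  # hypothetical peptide, not from SEQ ID NO 43
    print(hydrophilic_windows(toy))
```

A scan of this kind only flags candidate regions; whether a fragment actually reacts with antiserum has to be established experimentally, as stated above.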

The above polypeptide fragments can be obtained by means of chemical synthesis, as well as DNA 
recombinant technique. 

Other polypeptide fragments of clone of SEQ ID NO 43, that is, polypeptides containing the entire or a 
part of a polypeptide consisting of 266 amino acids from Nos. 461 to 726; a polypeptide consisting of 74 
amino acids from Nos. 477 to 550; a polypeptide consisting of 42 amino acids from Nos. 963 to 1004; and a 
polypeptide consisting of 45 amino acids from Nos. 283 to 327, can be prepared in a large scale by 
recombinant DNA technique. 

[4] Gene Encoding NS4 to NS5 Regions 

(1) Preparation of cDNA clone of SEQ ID NO 64 - 75 and sequencing thereof 

The cDNA clones of SEQ ID NO 64 - 75 which encode a novel polypeptide of NS4 to NS5 regions of 
HCV protein and fragments thereof were cloned from serum from HC patients as follows. 

The cloning and sequencing of cDNA encoding HCV polypeptide can be carried out using any of known 
methods. However, it is hardly accomplished by the known "Okayama-Berg" or "Gubler-Hoffman" method 
because HCV is present in serum only in a slight amount and the HCV gene is liable to variation. The 
present inventors succeeded in cloning the gene from a slight amount of serum as will be hereinafter 
described in Example 1. Briefly, it was conducted by extracting nucleic acids from a serum of a patient 
suffering from HC. It is preferable to use serum showing OD value of 3.5 on a screening kit of Ortho. Before 
the extraction, it is desirable to add tRNA or polyribonucleoside to the serum as a carrier for viral RNA. For 
the purpose of the invention, tRNA is preferable because the degradation of RNA can be easily detected, at 
least after the addition of tRNA, by monitoring the existence of a sufficient amount of tRNA having an 
intact length on electrophoresis. 

The resultant RNA is converted into cDNA using reverse transcriptase in the presence of an appropriate 
oligonucleotide primer. The cDNA is then cloned and amplified by means of polymerase chain reaction 
(PCR) (Saiki et al., Nature 324: 126 (1986)) in the presence of a pair of primers. Although commercially 
available random primers can be used in the PCR, synthetic primers having the following base sequences 
are suitable for the present invention. 

Synthetic Primers for Cloning of HCV Gene 

        5'                                            3' 
MS126:  GGTGAGCATGGAGGTGACCAC (SEQ ID NO: 130) 
MS119:  TCATCCTCCTCCGCTCGAAGC (SEQ ID NO: 131) 
MS161:  GTGGACGCCTTNGCCTTCATNTC (SEQ ID NO: 132) 
MS162:  ACGGATGTCNTTCTCNGTNAC (SEQ ID NO: 133) 
MS121:  GGCGGAATTCCTGGTCATAGCCTCCGTGAA (SEQ ID NO: 134) 
MS163:  GGGGNATGGCCTATTGGCCTG (SEQ ID NO: 135) 
MS127:  GGCATGTGGGCCCAGGGGAGG (SEQ ID NO: 136) 
MS118:  TGTGAGCCCGAACCGGATGT (SEQ ID NO: 137) 
MS159:  GTGGTANTCCTGGACTCNTTNGA (SEQ ID NO: 138) 
MS160:  ACTACCGNGACGTGCTNAANGA (SEQ ID NO: 139) 
MS120:  TGGGGATCCCGTATGATACCCGCTGCTTTG (SEQ ID NO: 140) 
MS174:  ATTGTCAGATCTACGGGGCCACTT (SEQ ID NO: 141) 
MS175:  GCAAGCTTAAAAAAAAAAAAGGGGGATGGCCTATTGGCCTGGA (SEQ ID NO: 142) 



In the above sequences, the letter "N" refers to inosine. The above sequences are only illustrative and 
these base sequences are not critical. They can be modified by replacing nucleotide(s) with other(s), or 
deleting or inserting nucleotide(s). The replacement may be preferably introduced within 10 bases from 5' 
terminus involving 1 to several nucleotides, more preferably, within 5 bases involving less than 5 
nucleotides. The deletion may occur in the 5' terminal region involving 4 to 5 nucleotides, preferably, within 
several bases from the 5' terminus involving a few nucleotides. In case of insertion, it may be an addition of 
8 to 12, preferably 5 to 6, more preferably, a few nucleotides in 5' terminal region. 
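
Because "N" denotes inosine, which can pair with any base, a primer containing N matches a family of template sequences. The following Python sketch illustrates this for two of the primers listed above. The helper names and the short template are hypothetical, and the search is simplified: it looks for the primer sequence on the given strand only and ignores annealing conditions.

```python
import re

# Two primers copied from the list above; N stands for inosine.
PRIMERS = {
    "MS161": "GTGGACGCCTTNGCCTTCATNTC",
    "MS162": "ACGGATGTCNTTCTCNGTNAC",
}


def inosine_positions(primer):
    """1-based positions of N (inosine) within a primer."""
    return [i + 1 for i, base in enumerate(primer) if base == "N"]


def primer_regex(primer):
    """Regular expression in which each N matches any of A, C, G, T."""
    return re.compile(primer.replace("N", "[ACGT]"))


def find_primer(primer, template):
    """0-based index of the first match on the template, or None."""
    match = primer_regex(primer).search(template.upper())
    return match.start() if match else None


if __name__ == "__main__":
    template = "TTACGGATGTCATTCTCGGTCACGG"  # hypothetical template
    print(inosine_positions(PRIMERS["MS162"]))
    print(find_primer(PRIMERS["MS162"], template))
```

Placing inosine at positions expected to vary between isolates is one way to keep a single primer usable for a gene that is liable to variation.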

PCR can be conducted under appropriate conditions, for example, those described in Example 21 using 
the first complementary DNA (1st cDNA) as a template. The conditions may vary depending on the primers 
used, such as their base sequence or combination, the length to be amplified, or the like. Examples of pairs 
of primers are: MS127 - MS126; MS118 - MS119; MS159 - MS161; MS160 - MS162; MS120 - MS163; and 
MS120 - MS121. The resultant cDNA is then inserted into an appropriate site of a cloning vector, such as 
the SmaI site of pUC19. A cloning vector harboring the DNA fragment is subjected to the determination of base 
sequence. Generally, three clones obtained independently are employed and the base sequences of both 
strands are determined to obtain an entire base sequence. The sequence is conveniently determined using 
a fluorescence sequencer GENESIS 2000 (DUPONT) according to the protocol attached thereto. Alternatively, 
a conventional subcloning can be used when the DNA fragment consists of more than 180 
nucleotides or contains a region which is hardly determined by the fluorescence sequencer. The base 
sequences of the DNA fragments thus obtained are shown in SEQ ID NO 64 - 69, and 76 - 100. 

Clones N22-1, 3, N17-1, 2, 3, N29-1, 2, 3, N18-2, 3 and 4 were obtained from serum of a patient N, 
clones H22-3, 8, 9, H17-1, 3, H18-1, 2 and 3 from patient H, and clones O28-1, 2, 4, O30-2, 3 and 4 from patient 
O. It is generally accepted that the region encoding core protein and its 5' region contain few variations 
and are well conserved even in HCV. When regions encoding core protein and/or an upstream region thereof 
were cloned in the same manner as the above, variations were hardly observed between clones. In the 
present invention, although clones obtained from the same region on HCV gene were highly homologous, they 
proved to be DNA fragments distinguishable from each other in terms of nucleotide and amino acid 
sequences. This indicates that one patient may carry more than one HCV at the same time. 

From the above fact, the N22, N17, O28, N18, N29, and O30 regions are assumed to be highly liable to 
variation compared with the core-protein-encoding region and the upstream region thereof. 

Region on HCV gene which corresponds to each clone was designated as follows. The region of clones 
N22-1, 3, H22-3, 8 and 9 obtained using primers MS127 and MS126 was designated as N22. In the same 
manner, the regions on HCV gene corresponding to clones N17-1, 2, 3, H17-1 and 3 obtained using primers 
MS118 and MS119, clones O28-1, 2 and 4 obtained using primers MS159 and MS161, clones N29-1, 2 and 
3 obtained using primers MS160 and MS162, clones N18-2, 3, 4, H18-1, 2 and 3 obtained using primers 
MS120 and MS121, and clones O30-2, 3 and 4 obtained using primers MS120 and MS163 were designated as 
N17, O28, N29, N18 and O30, respectively. 

The comparison between base sequences of each clone and known HCV gene (Kato et al., Proc. Natl. 
Acad. Sci. USA, 87:9524-9528 (1990); and Takamizawa et al., Journal of Virology, 65,3: 1105-1113 (1991)) 
indicates that the clones align in the order of N22, N17, O28, N29, and O30, from 5' to 3' on the gene (N18 is 
included in the O30 region). 
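
The region designations, primer pairs and 5' to 3' order given above can be summarized as a small lookup table. The Python sketch below is illustrative only; treating the first-listed primer of each pair as the 5'-side primer and the second as the 3'-side primer is an inference from the order in which the text lists them, and the function name is hypothetical.

```python
# (5'-side primer, 3'-side primer) for each region, as listed in the text.
# Which primer is on which side is an inference from the listing order.
REGION_PRIMERS = {
    "N22": ("MS127", "MS126"),
    "N17": ("MS118", "MS119"),
    "O28": ("MS159", "MS161"),
    "N29": ("MS160", "MS162"),
    "N18": ("MS120", "MS121"),
    "O30": ("MS120", "MS163"),
}

# Order of the regions on the gene, 5' to 3'; N18 lies within O30.
ORDER_5_TO_3 = ["N22", "N17", "O28", "N29", "O30"]


def outer_primers(upstream_region, downstream_region):
    """Primer pair spanning two regions: 5' primer of the first, 3' primer of the second."""
    return (
        REGION_PRIMERS[upstream_region][0],
        REGION_PRIMERS[downstream_region][1],
    )


if __name__ == "__main__":
    print(outer_primers("N17", "O28"))
    print(outer_primers("N22", "N17"))
```

Under this reading, the outer primers returned for N17 and O28, N29 and N18, and N22 and N17 are the same pairs that the ligation steps in (2) use for joining those regions by PCR.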

There are overlapping regions between the clones, which were used to ligate the clones to each other. 
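
The use of an overlapping region to join two clones can be sketched as follows. This Python example is a simplification under stated assumptions: it requires the overlap to be identical in both fragments, whereas real clones may differ at variable positions, and it models only the resulting sequence, not the PCR itself. The function name, the minimum overlap and the sequences are hypothetical.

```python
def join_by_overlap(upstream, downstream, min_overlap=8):
    """Join two fragments whose 3' end and 5' end share an identical overlap.

    The longest suffix of upstream that equals a prefix of downstream is
    used, and the overlap appears only once in the joined sequence.
    """
    longest_possible = min(len(upstream), len(downstream))
    for size in range(longest_possible, min_overlap - 1, -1):
        if upstream[-size:] == downstream[:size]:
            return upstream + downstream[size:]
    raise ValueError("no overlap of the required length")


if __name__ == "__main__":
    # Hypothetical fragments sharing the 9-base overlap AGGCATCGA.
    left = "ATGGCCTTAGGCATCGA"
    right = "AGGCATCGATTTCCGG"
    print(join_by_overlap(left, right))
```

In the PCR-based ligation, the overlap lets the two template fragments prime each other, and the outer primers then amplify the joined product.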

(2) Ligation of Clones of SEQ ID NO 64 to 69 

Regions N22, N17, O28, N29, and O30 of clone N15 (see the above [3]), a cDNA clone obtained from 
serum of HC patients, were ligated in the following manners. 

1) Ligation of N17 and O28 Regions 

The ligation of N17 and O28 regions can be conducted using, for instance, clones N17-3 (SEQ ID NO 
81) and O28-1 (SEQ ID NO 86). The ligation was carried out by PCR. Thus, about equimolar amounts of DNA 
fragments (as template) of clones N17-3 and O28-1 in a solution were subjected to PCR in the presence of 
primers MS118 and MS161 to yield clone 1728. 

2) Ligation of N29 and N18 Regions 

In the same manner as the above 1), N29 and N18 regions were ligated using clones N29-1 (SEQ ID 
NO 89) and N18-4 (SEQ ID NO 92), and primers MS160 and MS121 to yield clone 2918. 

3) Ligation of Regions N17 to N18 

PCR was carried out using DNA fragments of clones 1728 and 2918 and primers MS118 and MS121 to 
yield clone 1718 which contains regions N17, O28, N29 and N18 from 5' to 3'. The clone 1718 was cloned into 
the SmaI site of pUC19 to give plasmid 1718 in which EcoRI site from pUC19, clone N17-3 and N18 regions on 
HCV gene are aligned in this order from 5' to 3'. 

4) Ligation of Regions N22 to N17 

In the same manner as the above 1), DNA fragments of clones N22-1 (SEQ ID NO 76) and N17-3 (SEQ 
ID NO 81) were ligated by PCR using primers MS127 and MS119 to yield a DNA fragment designated as 
clone 2217 which contains N22 and N17 from 5' to 3'. The clone 2217 was cloned into the SmaI site of pUC19 
in the same manner as the above 3) to give plasmid 2217 in which the EcoRI site is located at the 5' terminus. 



5) Ligation of Clones 2217 and 1718 

Upon digestion with restriction enzyme XbaI, clone 1718 is cleaved at one site. Plasmid pUC1718 was 
digested with XbaI and a DNA fragment comprising DNA fragment of clone 1718 and XbaI site of pUC19 
was isolated. The DNA fragment derived from clone 2217 was inserted into XbaI site of pUC2217 such that 
the XbaI site in N17 region of pUC2217 and XbaI site from pUC19 are ligated to obtain plasmid pUC2218. 

6) Ligation of N15 Region and O30 Region Corresponding to 3' Terminal Region of HCV Gene 

An example of DNA fragment of O30 region is clone O30-3 of SEQ ID NO 98. Plasmid pUCO30 
contains the DNA fragment of O30-3 at SmaI site of pUC19 in the order of, from 5' to 3', EcoRI site and 
clone O30-3. Plasmid pUCN15 contains a DNA fragment of HCV gene, clone N15 (see [3]), forwardly at 
SmaI site of pUC19 in the order of, from 5' to 3', EcoRI site and clone N15. 

Plasmid pUCO30 was cleaved by SacI and blunt ended, which was followed by the cleavage at another 
cloning site, HindIII, to isolate a DNA fragment derived from HCV gene, which was ligated to a DNA 
fragment from plasmid pUCN15 which was digested with XbaI, blunt ended, and digested with HindIII to yield 
plasmid pUC15-30. Taking advantage of the fact that said plasmid pUC15-30 has only one site which can 
be cleaved by restriction enzymes BglII and HindIII, it was subjected to PCR using a primer MS174 having 
a BglII site in sequence derived from clone O30-3 in order to add poly U at 3' terminus of clone O30-3. 

PCR was conducted using, as a template, pUC15-30 and primers MS174 and MS175. The PCR fragment 
was then digested with BglII and HindIII and the resultant fragment was ligated to a BglII-HindIII fragment of 
pUCO30 containing the vector fragment of pUCO30 to obtain plasmid pUC15-30U having poly U attached to 
the 3' terminus of clone O30-3. 

7) Ligation of N15 to O30 Regions 

There is an ApaI site within a region common to N15 and N22 regions. There is also an ApaI site within a 
region common to N18 and O30. A DNA fragment isolated from pUC2218 with ApaI was inserted into the ApaI 
site of pUC15-30U appropriately to obtain plasmid pUC1530U. 
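
The restriction sites carried by primers MS174 and MS175, which are used in step 6) above, can be checked with a short Python sketch. The recognition sequences in the table are the standard ones for these enzymes and are supplied here as background knowledge, not taken from the text; the helper names are hypothetical.

```python
# Standard recognition sequences (background knowledge, not from the text).
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "BglII": "AGATCT",
    "XbaI": "TCTAGA",
}

# Primer sequences as given in the list of synthetic primers.
MS174 = "ATTGTCAGATCTACGGGGCCACTT"
MS175 = "GCAAGCTTAAAAAAAAAAAAGGGGGATGGCCTATTGGCCTGGA"


def find_sites(sequence):
    """Map enzyme name to the 0-based positions of its site; enzymes without a site are omitted."""
    found = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            start = sequence.find(site, start + 1)
        if positions:
            found[enzyme] = positions
    return found


def longest_run(sequence, base):
    """Length of the longest uninterrupted run of one base."""
    best = current = 0
    for symbol in sequence:
        current = current + 1 if symbol == base else 0
        best = max(best, current)
    return best


if __name__ == "__main__":
    print(find_sites(MS174))
    print(find_sites(MS175))
    print(longest_run(MS175, "A"))
```

MS174 carries a BglII site and MS175 carries a HindIII site followed by a run of A residues, which is consistent with digesting the PCR fragment with BglII and HindIII and with the poly U tail being introduced through the complementary run in MS175.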

The ligated N15 to O30 regions encode an amino acid sequence which is highly homologous to the amino 
acid sequence of NS5 and a part of non-structural protein NS4 of Flavivirus, a related strain of HCV. It was also 
confirmed that said region is homologous to a sequence encoding a part of NS4 region and NS5 region by 
comparison with a known sequence of HCV gene disclosed by aforementioned Chiron, Shimotohno, or 
Takamizawa. In conclusion, the clone disclosed in the Sequence Listing represents a DNA sequence assumed to be NS4 
and NS5 regions of HCV gene. The clone was then inserted into an expression plasmid to produce 
polypeptide encoded by said clone. The polypeptide was then evaluated as to the ability to react 
immunologically with antiserum of HC patients. 

(3) Expression of Polypeptides Encoded by Clones 

DNA fragments obtained in the above (2) can be used to produce a recombinant HCAg by constructing 
an expression vector containing DNA encoding a clone, by inserting the DNA into a known expression 
vector at an appropriate site of the vector, downstream from a promoter, using a well known method per se, 
and introducing the expression vector harboring the DNA into a host cell such as Escherichia coli cell, yeast 
cell, animal cell or the like according to the method known to one of skill, culturing the transformant in a 
medium under an appropriate condition, and recovering a product from the cultured broth. 

The present invention can be accomplished using any expression vectors which have a promoter at an 
appropriate site to direct the expression of a DNA encoding HCV polypeptide or a fragment thereof. 
Expression vectors preferably contain promoter, ribosome binding (SD) sequence, gene encoding HCV 
polypeptide, transcription termination factor, and a regulator gene. 

Expression vectors functional in microorganisms such as Escherichia coli, Bacillus subtilis or the like 
will preferably comprise promoter, ribosome binding (SD) sequence, HCV-associated-protein-encoding 
gene, transcription termination factor, and a regulator gene. 

Examples of promoters include those derived from Escherichia coli or phages such as tryptophan 
synthetase (trp), lactose operon (lac), λ phage PL and PR, T5 early gene P25 and P26 promoters and the like. 
These promoters may have a modified or designed sequence for each expression vector, such as pac 
promoter. 

Although the SD sequence may be derived from Escherichia coli or phage, a sequence which has been 
designed to contain a consensus sequence consisting of more than 4 bases, which is complementary to the 
sequence at the 3' terminal region of 16S ribosomal RNA, may also be used. 
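
The requirement that a designed SD sequence be complementary to the 3' terminal region of 16S ribosomal RNA over more than 4 bases can be checked as sketched below in Python. The 3' terminal sequence used (ACCUCCUUA, for E. coli) is supplied from general knowledge and is an assumption with respect to this text, as are the function names and the candidate sequences.

```python
# 3' terminal bases of E. coli 16S rRNA, written 5' to 3'.
# This sequence is an assumption supplied from general knowledge.
RRNA_16S_3PRIME = "ACCUCCUUA"

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(rna):
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def longest_complementary_run(candidate, rrna_tail=RRNA_16S_3PRIME):
    """Longest stretch of the candidate that can pair with the rRNA tail."""
    candidate = candidate.upper().replace("T", "U")
    target = reverse_complement_rna(rrna_tail)
    best = 0
    for start in range(len(candidate)):
        for end in range(start + best + 1, len(candidate) + 1):
            if candidate[start:end] in target:
                best = end - start
            else:
                break
    return best


def is_acceptable_sd(candidate, minimum=5):
    """True when more than 4 consecutive bases are complementary."""
    return longest_complementary_run(candidate) >= minimum


if __name__ == "__main__":
    for sd in ("AGGAGG", "AGGA"):  # hypothetical candidates
        print(sd, longest_complementary_run(sd), is_acceptable_sd(sd))
```

Only Watson-Crick pairs are counted in this sketch; G-U wobble pairing and the spacing between the SD sequence and the initiation codon, which also affect translation, are not modelled.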

The transcription termination factor is not essential. However, it is preferable that an expression vector 
contains a ρ-independent factor such as lipoprotein terminator, trp operon terminator or the like. 

Preferably, these sequences required for the expression of a gene encoding HCAg originated from HCV 
are located, in an appropriate expression plasmid, in the order of promoter, SD sequence, said gene and 
transcription termination factor from 5' to 3' direction. 

A typical example of expression vectors is commercially available pKK233-2 (Pharmacia). However, a 
series of plasmids pGEX (Pharmacia), which are provided for the expression of fused protein, are also 
employable for the expression of HCAg-encoding gene of the present invention. 

A suitable host cell such as Escherichia coli can be transformed with an expression vector comprising a 
DNA of the invention by any of known methods such as protocol provided by TOYOBO Japan as described 
in Example 22. 

The cultivation of the transformants can be carried out using any of well known procedures in literatures 
such as Molecular Cloning, 1982, and the like. The cultivation is preferably conducted at a temperature from 
about 28°C to 42°C. 

Expression vectors used for transforming other host cells, such as those derived from insects or 
animals including mammals, consist of substantially the same elements as those described in the above. 
However, there are certain preferable factors as follows. 

When insect cells are used, a commercially available kit, MAXBAC™ is employed according to the 
teaching of the supplier (MAXBAC™ BACULOVIRUS EXPRESSION SYSTEM MANUAL VERSION 1.4). In 
this case, it is desirable to make a modification to reduce the distance between the promoter of polyhedrin 
gene and the initiation codon so as to improve the expression of the gene. 

When animal cells are used as hosts, expression vectors preferably contain active-type promoter from 
adenovirus E1A gene (ZOKUSEIKAGAKU JIKKEN KOZA I, IDENSHI KENKYU-HO II, 189-190, 1986), SV40 
early promoter, SV40 late promoter, apolipoprotein E gene promoter, SRα promoter (Molecular and Cellular 
Biology, 8, 1, 466-472, 1988) or the like. Specifically, known expression vectors such as pKCR (Proc. Natl. 
Acad. Sci. USA, 78: 1528 (1981)) or a derivative thereof which is prepared by modifying pKCR while 
maintaining its essential functions, pBPV MT1 (Proc. Natl. Acad. Sci. USA, 80: 398 (1983)), or the like 
may be employed. 

Animal cells usable in the present invention are CHO cell, COS cell, mouse L cell, mouse C127 cell, 
mouse FM3A cell and the like. 

A clone of the invention can be inserted into an expression vector for procaryotic cells such as E. coli or 
eucaryotic cells such as animal cells after modifying the DNA sequence to bring it in conformity with a 
frame of initiation codon of said vector. Alternatively, an initiation codon is added at the 5' terminus of DNA 
so that an appropriate translational frame can be produced. The term "a translational frame" of a clone 
refers to a frame of a base sequence in which bases are described as triplets capable of encoding amino 
acid sequence as illustrated in SEQ ID NO 64 to 75. 

The recombinant polypeptide expressed by host cells such as microorganisms including E. coli, insect 
cells and animal cells can be recovered from the cultured broth by known methods and identified by, for 
example, immunoreactions between the expressed product and antiserum obtained from HC patients using 
a conventional method such as Western blot analysis. 

From a hydrophilicity study and prediction of higher-order structure of the protein, the following peptide fragments 
contained in a polypeptide having amino acid sequence of SEQ ID NO 75 appeared to be highly hydrophilic 
and can take a so-called "turn structure", not α-helix or β-sheet structure, with high probability. Therefore, these 
fragments possibly represent antigen determinants, or can contain at least one antigen determinant of 
HCAg. Although the higher-order structure in serum and the specific reactivity of each fragment are not 
established, it can be concluded that the following peptide fragments are highly reactive with antiserum 
raised against HCV-associated antigens. A polypeptide comprising at least 20 amino acids from amino acid 
Nos. 324 to 343; a polypeptide comprising at least 14 amino acids from Nos. 356 to 369; a polypeptide 
comprising at least 18 amino acids from Nos. 584 to 601; a polypeptide comprising 10 amino acids from 
Nos. 588 to 597; a polypeptide consisting of 10 amino acids from Nos. 620 to 629; a polypeptide consisting 
of 18 amino acids from Nos. 901 to 918; and a polypeptide which contains at least any of those described 
in the above and comprises 25 or less amino acids of SEQ ID NO 75. 

The above polypeptide fragments can be obtained by means of chemical synthesis, as well as DNA 
recombinant technique. 

Other polypeptide fragments of SEQ ID NO 75, that is, polypeptides containing the entire or a part of a 
polypeptide consisting of 74 amino acids from Nos. 413 to 486; a polypeptide consisting of 997 amino 
acids from Nos. 415 to 1411; a polypeptide consisting of 74 amino acids from Nos. 655 to 728; a 
polypeptide consisting of 98 amino acids from Nos. 858 to 955; a polypeptide consisting of 92 amino acids 
from Nos. 1009 to 1100; a polypeptide consisting of 66 amino acids from Nos. 1160 to 1225; and a 
polypeptide consisting of 54 amino acids from Nos. 763 to 816 can be prepared in a large scale by 
recombinant DNA technique. 

[5] Preparation of a cDNA Clone T7N1-30U Originated from Serum of HC Patient (SEQ ID NO 101) 

The gene or a DNA fragment encoding a novel polypeptide of SEQ ID NO 101 can be obtained by 
following procedures. 

The ligation of clones N19MX24-A-1 and MX25-1 by PCR gives a DNA fragment in which either of the 3' 
sequence of MX24 region and 5' sequence of MX25 region, which are overlapping each other, is 
preferentially used (Clone 1925). A synthetic DNA was synthesized in order to introduce into clone N1-1, 
from 5' to 3', restriction sites HindIII and SpeI and T7 promoter, and clone T7N1-1 was obtained by cassette 
ligation. Clone T7N1N3N10 was obtained in the same manner as that used for the preparation of clone 
N1N3N10 except that clone T7N1-1 was used instead of clone N1-1. This clone was ligated to clone 
N27N19-1 by restriction enzyme BamHI to obtain clone T7N119. The clones T7N119 and 1925 have N19 
regions and both clones were ligated using PvuI restriction site in the N19 region to yield clone T7N1-25. 

An EcoRI-NotI-BamHI adapter (Toyobo) was ligated to plasmid pUC1530U at the HindIII site in its 3' 
terminal region to obtain clone 1530UNot which contains NotI site at 3' terminus of clone 1530U. 

For the ligation of clones T7N1-25, 1530UNot, and MX25N15-1, prepared in [3], the three clones were 
ligated at PstI site in MX25 region common to clones T7N1-25 and MX25N15-1 and EcoT22I site in N15 
region common to clones 1530UNot and MX25N15-1. Clone T7N1-25 has SpeI site at 5' terminus and clone 
1530UNot has NotI site at 3' terminus. 

HCV gene can be prepared by ligating clones T7N1-25, MX25N15-1 and 1530UNot in this order without 
overlapping. Thus, clone T7N1-25 was digested with SpeI and PstI, clone MX25N15-1 with PstI and 
EcoT22I, clone 1530UNot with EcoT22I and NotI, and λZapII (Stratagene) with SpeI and NotI, respectively, and 
the resultant fragments were ligated to yield a phage carrying a single DNA fragment having a sequence of 
HCV gene between SpeI and NotI sites of λZapII (from 5' to 3': clone T7N1-25, MX25N15-1 and 1530UNot). 
The resultant HCV derived clone was designated as T7N1-30U. Ligation to λZapII (Stratagene), isolation of 
phage DNA, and subcloning into pBluescriptII can be conducted according to the protocol attached to the kit. 
The packaging for the preparation of phage particles was carried out using Gigapack II Packaging Extracts 
(Stratagene) according to the protocol attached thereto. The clone T7N1-30U is a DNA fragment which 
comprises a cDNA originated from HCV having an inserted T7 phage promoter at 5' terminus, and poly T at 
3' terminus. 

[6] Expression of Fused Polypeptides Encoded by cDNA Originated from Serum of HC Patients 

Recombinant HCV-associated antigen can be obtained by expressing all or a part of clones prepared in 
[1], [2], [3] or [4], or DNA sequence encoding all the protein of HCV prepared in [5]. 

The present invention can be accomplished using any expression vectors which have a promoter at an 
appropriate site to direct the expression of a DNA encoding HCV polypeptide or a fragment thereof. 
Expression vectors preferably contain promoter, ribosome binding (SD) sequence, gene encoding HCV 
polypeptide, transcription termination factor, and a regulator gene. 

Expression vectors functional in microorganisms such as Escherichia coli, Bacillus subtilis or the like 
will preferably comprise promoter, ribosome binding (SD) sequence, HCV-associated-protein-encoding 
gene, transcription termination factor, and a regulator gene. 

Examples of promoters include those derived from Escherichia coli or phages such as tryptophan 
synthetase (trp), lactose operon (lac), λ phage PL and PR, T5 early gene P25 and P26 promoters and the like. 
These promoters may have a modified or designed sequence for each expression vector, such as pac 
promoter. 

Although the SD sequence may be derived from Escherichia coli or phage, a sequence which has been 
designed to contain a consensus sequence consisting of more than 4 bases, which is complementary to the 
sequence at the 3' terminal region of 16S ribosomal RNA, may also be used. 

The transcription termination factor is not essential. However, it is preferable that an expression vector 
contains a ρ-independent factor such as lipoprotein terminator, trp operon terminator or the like. 

Preferably, these sequences required for the expression of a gene encoding HCAg originated from HCV 
are located, in an appropriate expression plasmid, in the order of promoter, SD sequence, said gene and 
transcription termination factor from 5' to 3' direction. 

Typical example of expression vectors is commercially available pKK233-2 (Pharmacia). However, a 
series of plasmids pGEX (Pharmacia), which are provided for the expression of fused protein, are also 
employable for the expression of HCAg-encoding gene of the present invention. 

A suitable host cell such as Escherichia coli can be transformed with an expression vector comprising a 
DNA of the invention by any of known methods such as protocol provided by TOYOBO Japan as described 
in Example 30. 

The cultivation of the transformants can be carried out using any of well known procedures in literatures 
such as Molecular Cloning, 1982, and the like. The cultivation is preferably conducted at a temperature from 
about 28°C to 42°C. 

Expression vectors used for transforming other host cells, such as those derived from insects or 
animals including mammals, consist of substantially the same elements as those described in the above. 
However, there are certain preferable factors as follows. 

When insect cells are used, a commercially available kit, MAXBAC™ is employed according to the 
teaching of the supplier (MAXBAC™ BACULOVIRUS EXPRESSION SYSTEM MANUAL VERSION 1.4). In 
this case, it is desirable to make a modification to reduce the distance between the promoter of polyhedrin 
gene and the initiation codon so as to improve the expression of the gene. 

When animal cells are used as hosts, expression vectors preferably contain active-type promoter from 
adenovirus E1A gene (ZOKUSEIKAGAKU JIKKEN KOZA I, IDENSHI KENKYU-HO II, 189-190, 1986), SV40 
early promoter, SV40 late promoter, apolipoprotein E gene promoter, SRα promoter (Molecular and Cellular 
Biology, 8, 1, 466-472, 1988) or the like. Specifically, known expression vectors such as pKCR (Proc. Natl. 
Acad. Sci. USA, 78: 1528 (1981)) or a derivative thereof which is prepared by modifying pKCR while 
maintaining its essential functions, pBPV MT1 (Proc. Natl. Acad. Sci. USA, 80: 398 (1983)), or the like 
may be employed. 

Animal cells usable in the present invention are CHO cell, COS cell, mouse L cell, mouse C127 cell, 
mouse FM3A cell and the like. 

A clone of the invention can be inserted into an expression vector for procaryotic cells such as E. coli or 
eucaryotic cells such as animal cells after modifying the DNA sequence to bring it in conformity with a 
frame of initiation codon of said vector. Alternatively, an initiation codon is added at the 5' terminus of DNA 
so that an appropriate translational frame can be produced. The term "a translational frame" of a clone 
refers to a frame of a base sequence in which bases are described as triplets capable of encoding amino 
acid sequence as illustrated in SEQ ID NO 1 to 104. 

The recombinant polypeptide expressed by host cells such as microorganisms including E. coli, insect 
cells and animal cells can be recovered from the cultured broth by known methods and identified by, for 
example, immunoreactions between the expressed product and antiserum obtained from HC patients using 
a conventional method such as Western blot analysis. 

Polypeptide encoded by gene of the invention contains region(s) which seem to be immunologically 
highly reactive with antiserum of HC. These regions were ligated and expressed in various cells as fused 
protein. For example, polypeptide having amino acids from Nos. 1 to 115 of SEQ ID NO 3 was expressed 
using expression vector pCZCORE. The expression vector was modified to replace the 3' region from the 
epitopic region of said polypeptide with clone N23 which encodes a desired polypeptide to express a fused 
protein. It was followed by the ligation of a polypeptide having amino acids from Nos. 963 to 1005 of SEQ 
ID NO 43 to the C-terminus of polypeptide encoded by N23 region. Thus, regions encoding polypeptides 
which seem to be immunologically highly reactive with antiserum of HC patients were ligated to cDNA and 
inserted into an expression vector to express said polypeptides. 

Specifically, as shown in Example 30, a polypeptide CN23 which contains an epitopic region of core 
protein of HCV and a region comprising an epitope which is encoded by clone N23, a part of non-structural 
protein region NS3, and seems to be immunologically highly reactive with antiserum of HC patients, was 
expressed directly in E. coli. 

Thus, clone N23, from No. 107 (G), was inserted in frame into pCZCORE at the SacII site within core 
gene. Expression vector pCZCN23 capable of expressing epitopic regions of core protein and a polypeptide 
encoded by N23 as a fused protein was constructed by ligating a part of N23 to the 3' terminus of the 
N-terminal gene of core protein. A DNA fragment which encodes HCV protein and has SD sequence at 5' 
terminus was ligated in tandem to the vector, resulting in the expression of desired polypeptide in large 
scale. 

The resultant fused protein comprising epitopic regions of core protein and N23 region reacted with 
antiserum of HC patient in high probability. 

Thus, the present invention provides a novel gene of HCV or a fragment thereof and polypeptide 
encoded by the same. The recombinant polypeptide is highly reactive with HCAb and can be used for the 
development of a method for detecting HCAb efficiently, and for the preparation of vaccine. DNA and 
polypeptides of the invention are also useful for the development of in vivo or in vitro system for the 
estimation of protease activity of HCV. 

The following Examples further illustrate and detail the invention disclosed, but should not be construed 
to limit the invention. Throughout the Examples concerning the isolation of RNA and cloning of cDNA, the tip or 
pipet used for the preparation of samples and/or reagents employed for reaction was changed to a cleaned 
and/or sterilized one every time to prevent the sample from contamination. The procedures which are 
not specifically described were conducted substantially in accordance with the teachings of the literature given 
in parentheses. 

Electrophoresis of nucleic acids (Molecular Cloning (1982), Cold Spring Harbor): cleavage of DNA 
fragment with restriction enzymes <Molecular Cloning (1982), Cold Spring Harbor); or a catalogue 
"IDENSHIKOGAKU KENKYU-YO SIYAKU SOGO KATALIOGU", Toyobo): ligation reaction of DNA fragments 
5 (TAKARA Biotechnology Catolog, 1991, vol. 1, Takara Shuzo): extraction of DNA from acrylamide gel or 
agarose gel (Molecular Cloning (1982), Cold Spring Harbor): cultivation of E. coli transformants transformed 
with a plasmid on agarose plate and isolation of colony therefrom (Molecular Cloning (1982), Cold Spring 
Harbor). 

Example 1



Extraction of Nucleic Acids from Serum of a Patient Suffering from Hepatitis C 

To 10 ml of a serum from an HC patient (OD = 3.5 or more on HCV EIA kit of Ortho & Co.) was
added 25 ml of Tris buffer (50 mM Tris-HCl (pH 8.0), 1 mM EDTA, 100 mM NaCl), and the mixture was mixed and
centrifuged (20,000 x g, 20 min) at 20 °C. The supernatant was centrifuged (100,000 x g, 5 hr) at 20 °C. The pellet was
dissolved in 1.5 ml of Proteinase K solution (1% sodium dodecyl sulfate, 10 mM EDTA, 10 mM Tris-HCl
(pH 7.5), 2 mg/ml Proteinase K (Pharmacia), and 6.6 ug yeast tRNA mixture) and the solution was incubated at
45 °C for 90 min. The solution was then subjected to phenol/chloroform extraction (more than 4 times),
which was carried out by adding an equal volume of phenol/chloroform to the solution, vigorously mixing,
and centrifuging to recover the aqueous layer containing nucleic acids. It was followed by chloroform
treatments (more than two times) and ethanol precipitation. The ethanol precipitation was carried out by
mixing the aqueous solution with 2.5 volumes of ethanol containing either 1/10 volume of 3 M sodium
acetate or an equal volume of 4 M ammonium acetate, allowing the mixture to stand overnight at -20 °C, or for more
than 15 min at -80 °C, centrifuging (35,000 rpm, 4 hr) in an SW41 Ti Rotor (Beckman) to pellet nucleic acids,
and recovering the pellet. The pellet of nucleic acid was then dried for subsequent use.

Example 2 
Synthesis of cDNA

[1] Preparation of RNA Sample Solution 

RNA sample solution was prepared by dissolving the dried nucleic acid obtained in Example 1 in 30 ul
of water containing 10 ul of ribonuclease inhibitor (100 U/ul, Takara Shuzo, Japan).

[2] Synthesis of cDNA Using Random Primer 

To 2 ul of RNA sample solution were added 2.7 ul of random primer (0.1 70D, Amersham), 2 ul of 10 x
PCR (Mg) buffer (100 mM Tris-HCl (pH 8.3), 500 mM KCl, 60 mM MgCl2), 8 ul of 1.25 mM 4 dNTPs, and 2 ul of
water, and the mixture was incubated at 65 °C for 5 min then at 25 °C for 5 min. To the mixture were added 1 ul
of reverse transcriptase (25 U, Life Science) and 1 ul of ribonuclease inhibitor (100 U/ul, Takara Shuzo), and the
mixture was incubated at 37 °C for 20 min, at 42 °C for 30 min, and finally at 95 °C for 5 min, which was
followed by prompt cooling to 0 °C (synthesis of cDNA).

Amplification of DNA having specific sequences was conducted substantially in accordance with the
polymerase chain reaction (PCR) of Saiki et al. (Nature 324: 126 (1986)). Throughout the specification, the
expression that PCR was carried out according to "Saiki's method" means that the PCR was conducted
substantially in accordance with the polymerase chain reaction (PCR) of Saiki et al. For the PCR, 100 ul of
a mixture containing 2 ul of cDNA solution, 10 ul of 10 x PCR buffer (100 mM Tris-HCl (pH 8.3), 500 mM
KCl, 150 mM MgCl2, 1% gelatin), 8 ul of 2.5 mM 4 dNTPs, 50 pmol each of two synthetic primers (the pair
of primers consists of S1 - AS1, S2 - AS1, S2 - AS2, or S4 - AS3) and water was incubated at 95 °C for 5
min, then immediately cooled to 0 °C. One minute later, it was mixed with 0.5 ul of Taq DNA polymerase (7
U/ul, AmpliTaq™, Takara Shuzo) and overlaid with mineral oil. The resultant sample was then subjected to
PCR. PCR was conducted by repeating 25 times a reaction cycle which comprises the following
treatments: at 95 °C for 1 min; at 40 - 55 °C for 1 min; and at 72 °C for 1 - 5 min in a DNA Thermal Cycler
(Perkin Elmer Cetus). The reaction mixture was then subjected to phenol/chloroform extraction and ethanol
precipitation to obtain amplified DNA fragments. The ethanol precipitation was generally carried out by
adding 2.5 volumes of ethanol and either about 1/10 volume of 3 M sodium acetate or an equal volume
of 4 M ammonium acetate to the aqueous solution, mixing, centrifuging at 15,000 rpm for 15 min using a
rotor of about 5 cm in diameter under cooling at 4 °C to pellet the precipitates, and drying the pellet.
Throughout the specification, the procedure "ethanol precipitation" means the above-mentioned procedures.
In the same manner as the above, various DNA fragments were obtained using different pairs of primers in
PCR.

[3] Synthesis of cDNA Using Antisense Primer

To 2 ul of the RNA sample solution prepared in [1] above were added 1 ul of 15 pmol/ul antisense primer
(synthesized primer AS1, AS2 or AS3), 2 ul of 10 x RT buffer (100 mM Tris-HCl (pH 8.3), 500 mM KCl), 4 ul
of 25 mM MgCl2, 8 ul of 2.5 mM 4 dNTPs, and 1 ul of water, and the mixture was incubated at 65 °C for 5 min then
at room temperature for 5 min. To the mixture were added 1 ul of reverse transcriptase (25 U, Life Science) and
1 ul of ribonuclease inhibitor (100 U/ul, Takara Shuzo), and the mixture was incubated at 37 °C for 20 min, at 42
°C for 30 min, and finally at 95 °C for 2 min, which was followed by immediate cooling to 0 °C (synthesis
of cDNA).

Amplification of DNA containing specific sequences was conducted by PCR (Saiki et al., Nature 324:
126 (1986)). Thus, a 100 ul mixture containing 10 ul of cDNA solution, 10 ul of 10 x PCR buffer (100 mM
Tris-HCl (pH 8.3), 500 mM KCl, 15 mM MgCl2, 1% gelatin), 8 ul of 2.5 mM 4 dNTPs, 2 ul of 15 pmol/ul
synthetic DNA primer (the same primer as used in the synthesis of cDNA), 3 ul of 15 pmol/ul synthetic
DNA primer (a counterpart of paired primers) and water was incubated at 95 °C for 5 min, then immediately
cooled to 0 °C. One minute later, it was mixed with 0.5 ul of Taq DNA polymerase (7 U/ul, AmpliTaq™,
Takara Shuzo) and overlaid with mineral oil. The resultant sample was then subjected to PCR. PCR was
conducted by repeating 25 times a reaction cycle which comprises the following treatments: at 95 °C for 1
min; at 40 - 55 °C for 1 min; and at 72 °C for 1 - 5 min in a DNA Thermal Cycler (Perkin Elmer Cetus).
Finally, the reaction mixture was incubated at 72 °C for 7 min, which was followed by phenol/chloroform
extraction and ethanol precipitation to obtain different amplified DNA fragments derived from each of the
above-mentioned pairs of primers.
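
The composition of the 100 ul PCR mixture of Example 2 [3] can be checked with a short calculation. The following is only an illustrative sketch: the volumes and stock concentrations are those recited above, the helper name `final_conc` is hypothetical, and it is assumed that "2.5 mM 4 dNTPs" means 2.5 mM of each nucleotide.

```python
# Final concentrations in the 100 ul PCR mixture of Example 2 [3].
# Volumes and stock concentrations come from the text; `final_conc`
# is a hypothetical helper introduced only for this illustration.

TOTAL_UL = 100.0


def final_conc(stock, volume_ul, total_ul=TOTAL_UL):
    """Dilution rule: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


# Components of the 10 x PCR buffer, of which 10 ul is added.
buffer_10x = {
    "Tris-HCl (mM)": 100.0,
    "KCl (mM)": 500.0,
    "MgCl2 (mM)": 15.0,
    "gelatin (%)": 1.0,
}

mix = {name: final_conc(conc, 10.0) for name, conc in buffer_10x.items()}
# Assumption: the 2.5 mM dNTP stock is 2.5 mM for each of the four dNTPs.
mix["each dNTP (mM)"] = final_conc(2.5, 8.0)
# 15 pmol/ul is numerically the same as 15 uM.
mix["cDNA-synthesis primer (uM)"] = final_conc(15.0, 2.0)
mix["counterpart primer (uM)"] = final_conc(15.0, 3.0)
# 0.5 ul of enzyme at 7 U/ul.
taq_units = 0.5 * 7.0

for name, value in mix.items():
    print(f"{name}: {value:g}")
print(f"Taq DNA polymerase (U per reaction): {taq_units:g}")
```

Under these assumptions the reaction contains the buffer at 1 x strength, and the two primers are present in unequal amounts because 2 ul and 3 ul of the same 15 pmol/ul stock are used.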

Example 3 

Cloning and Sequencing of Amplified DNA Fragments

Dried DNA fragment (at least 1 pmole) obtained in the above Example 2, [2] or [3] was blunt-ended with
T4 DNA polymerase (Toyobo), 5'-end phosphorylated with polynucleotide kinase (Toyobo), and ligated
into the SmaI site of the multi-cloning sites of 5 ng to 10 ng of pUC19 cloning vector. The cloning vector had been
previously treated as follows: digestion with the restriction enzyme SmaI (Toyobo), phenol/chloroform
extraction, ethanol precipitation, 5'-end dephosphorylation with alkaline phosphatase (Boehringer Mannheim),
phenol/chloroform extraction, and ethanol precipitation. The ligated DNA was used to transform
competent E. coli JM109 or DH5 cells (Toyobo). The transformation was carried out according to the
manufacturer's instructions (COMPETENT HIGH, Toyobo). Plasmid clones were recovered
from transformed cells conventionally. At least 20 transformants were obtained using pUC19 cloning vectors
containing either of the DNA fragments obtained in the above Example 2, [2] and [3] using each pair of primers.

The determination of the base sequence of the DNA fragment was conducted with a Fluorescent DNA Sequencer
(GENESIS 2000, Dupont) using, as sequence primers, the following synthetic primers:

5' d(GTAAAACGACGGCCAGT) 3' (SEQ ID NO 143) and

5' d(CAGGAAACAGCTATGAC) 3' (SEQ ID NO 144) for the + and - strands of the DNA fragment to be
sequenced.

Base sequences of the clones are given in SEQ ID NO 1 to 4 and 9 to 12. Base sequences of SEQ ID NO 1,
2, 3, 4, 9, 10, 11 and 12 correspond to those of the + strand of clones N1-1, N2-1, N3-1, N10-1, N1-2, S1-1, S1-2
and S1-3 of transformants, respectively. These clones are double-stranded DNA which were prepared in
the same manner as those described in Examples 2 and 3 using 4 kinds of pairs of primers shown in
Example 2, [2]. Plasmids used for sequencing the clones were designated as pUCN1-1, pUCN2-1, pUCN3-1,
pUCN10-1, pUCN1-2, pUCS1-1, pUCS1-2 and pUCS1-3, respectively. Each plasmid contained one DNA
molecule corresponding to each DNA fragment.

These base sequences represent base sequences of clones obtained by cloning the cDNA synthesized
from RNA isolated from serum of patient(s) suffering from HC. Therefore, these sequences are
specific for clones originated from serum of HCV-infected patients and cannot be found in or obtained from
serum of healthy subjects. Thus, cDNA prepared from RNA (if there is any) obtained from a healthy
subject under stricter conditions, for instance, by increasing (3 or 4 fold) the reaction cycles of PCR in
Example 2, [2] and [3], by repeating them 60 - 100 times, did not show any homology in base sequence
with those shown in SEQ ID NO 1 to 4. Consequently, base sequences of clones N1-1, N2-1, N3-1, N10-1,
N1-2, S1-1, S1-2 and S1-3 are specific for those obtained from serum of patients suffering from HC.

As the next step, the resultant DNA fragment was modified so that a polypeptide encoded by an open
reading frame should be expressed in a host cell transformed by the modified DNA, and the resultant
product was then evaluated as to the ability to react, as an antigenic polypeptide of HCV, with HCAb in
serum of HC patients.

Example 4

Preparation of Clone N1N3N10 or N3N10 

[1] Preparation of Clone N3N10 


One ul each of DNA fragments (about 200 - 300 ng) from clones N3-1 and N10-1 was added to a
reaction mixture containing 10 ul of 10 x PCR buffer (100 mM Tris-HCl (pH 8.3), 500 mM KCl, 15 mM
MgCl2, 1% gelatin), 8 ul of 2.5 mM 4 dNTPs, 5 ul each of 20 pmol/ul synthetic primers S2 and AS3, and
76.5 ul of water. After thorough mixing, the mixture was heated at 95 °C for 5 min, then immediately
cooled to 0 °C. One minute later, 0.5 ul of Taq DNA polymerase (7 U/ul, AmpliTaq™, Takara Shuzo) was
added to the mixture, which was mixed and overlaid with mineral oil. The sample was then subjected to PCR.
PCR was conducted by repeating 25 times a reaction cycle which comprises the following treatments: at
95 °C for 1 min; at 40 °C for 1 min; and at 72 °C for 2 min in a DNA Thermal Cycler (Perkin Elmer Cetus). It
was followed by an incubation at 97 °C for 2 min. The mixture was immediately cooled to 0 °C, kept at 0 °C
for 2 min, and mixed with 0.5 ul of Taq DNA polymerase (7 U/ul, AmpliTaq™, Takara Shuzo). The sample was
then treated in the same manner as the above by repeating 25 times a reaction cycle which comprises the
following treatments: at 95 °C for 1 min; at 50 °C for 1 min; and at 72 °C for 2 min. After the final treatment
at 72 °C for 7 min, the resultant reaction solution was treated with phenol/chloroform then precipitated with
ethanol. The amplified DNA samples were fractionated by agarose gel electrophoresis and a gel slice containing
a desired fragment having an expected length was removed (Molecular Cloning (1982) Cold Spring Harbor)
to isolate the DNA fragment therefrom conventionally. The resultant DNA fragment was then modified as
described in Example 3 and ligated into the SmaI site of the multi-cloning sites of pUC19, cloned and screened as
described in Example 3 to obtain plasmid pUCN3N10. The resultant cDNA derived from serum of an HC
patient was referred to as clone N3N10, whose base sequence is given in SEQ ID NO 5.


[2] Preparation of Clone N1N3N10 

Two overlapping clones N1-1 and N3N10 were ligated by taking advantage of a unique restriction site
which exists in the overlapping regions of both clones. Upon digestion with restriction enzyme BssHII,
clone N1-1 is cleaved at the 3' site of nucleotide No. 455 G and clone N3N10 at the 3' site of nucleotide
No. 159 G. The ligation of the two clones N1-1 and N3N10 was accomplished on the basis of an assumption
that plasmids pUCN1 and pUCN3N10 contain each clone in the same orientation. Thus, plasmid pUCN1
was digested with HindIII and BssHII to yield a 492 bp DNA fragment comprising a HindIII-SmaI DNA
fragment of plasmid pUC19 attached to the 5' end of the No. 455 bp nucleotide of clone N1-1 derived from
serum of an HC patient, which fragment was then exchanged with the 159 bp HindIII - BssHII fragment of plasmid
pUCN3N10, cloned and screened to obtain a plasmid pUCN1N3N10. The plasmid pUCN1N3N10 contained
the desired clone N1N3N10 comprising clones N1-1, N3-1 and N10-1 ligated without overlapping. The base
sequence of clone N1N3N10 is shown in SEQ ID NO 6.
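
The joining of two overlapping clones at a shared unique BssHII site can be pictured with the following minimal sketch. The sequences `clone_a` and `clone_b` are short hypothetical stand-ins, not the sequences of clones N1-1 and N3N10, and the function names are likewise introduced only for illustration. BssHII is taken to cut after the first G of its recognition sequence GCGCGC, which agrees with the statement above that each clone is cleaved at the 3' site of a G.

```python
# Joining two overlapping clones at a shared, unique BssHII site.
# The sequences below are toy placeholders, not data from the specification.

BSSHII = "GCGCGC"  # recognition sequence, cut as G^CGCGC


def cut_index(seq):
    """Return the index just 3' of the first G of the single BssHII site."""
    if seq.count(BSSHII) != 1:
        raise ValueError("expected exactly one BssHII site")
    return seq.index(BSSHII) + 1


def join_at_bsshii(upstream_clone, downstream_clone):
    """Keep the upstream clone up to the cut and the downstream clone after it."""
    left = upstream_clone[:cut_index(upstream_clone)]
    right = downstream_clone[cut_index(downstream_clone):]
    return left + right


# Hypothetical overlap region containing the unique site.
shared = "TTGACC" + BSSHII + "AATCGT"
clone_a = "ATGGCTAAGCTG" + shared   # stands in for the upstream clone
clone_b = shared + "CCATAGGTTCAA"   # stands in for the downstream clone

joined = join_at_bsshii(clone_a, clone_b)
print(joined)
print(len(joined), cut_index(clone_a) + len(clone_b) - cut_index(clone_b))
```

The length of the joined sequence equals the cut position in the upstream clone plus the part of the downstream clone lying 3' of its own cut position, so the overlapping region is represented only once.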

Example 5



Modification of DNA for the Expression of HCV Polypeptide Encoded by Clones N3-1 or N3N10
[1] Modification of DNA for the Expression of HCV Polypeptide Encoded by Clone N3-1 in E.coli 


Clone N3-1 contains a DNA fragment capable of encoding a structural protein of HCV which begins at
nucleotide No. 22 (A). The DNA can be expressed utilizing the ATG codon at nucleotides Nos. 22 to 24. The
modification of DNA was carried out using PCR. The following synthetic oligonucleotide primers were used.

5' primer:

5' GCAAGCTTATGAGCACAAATCCAAAACCCCAAAGA 3' (SEQ ID NO 145)

3' primer:

5' GCGAATTCAGATCTTCACCTACGCCGGGGGTCCGTGGG 3' (SEQ ID NO 146)

The synthetic DNA was adjusted to 20 pmol/ml before use.
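
The features built into the two primers above can be located with a short script. This is an illustrative sketch only: the primer sequences are those of SEQ ID NO 145 and 146, the recognition sequences are the standard ones for HindIII, EcoRI and BglII, and the helper names are hypothetical.

```python
# Locating the restriction sites and the start/stop codons carried by the
# primers of SEQ ID NO 145 and 146. Helper names are illustrative only.

FIVE_PRIMER = "GCAAGCTTATGAGCACAAATCCAAAACCCCAAAGA"      # SEQ ID NO 145
THREE_PRIMER = "GCGAATTCAGATCTTCACCTACGCCGGGGGTCCGTGGG"  # SEQ ID NO 146

SITES = {"HindIII": "AAGCTT", "EcoRI": "GAATTC", "BglII": "AGATCT"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def locate(seq, features):
    """Map each feature present in seq to the 0-based index of its first hit."""
    return {name: seq.find(motif) for name, motif in features.items() if motif in seq}


# The 5' primer is read as written (sense strand).
sense_features = locate(FIVE_PRIMER, {**SITES, "ATG": "ATG"})

# The 3' primer anneals to the sense strand, so its reverse complement
# shows how the 3' end of the amplified coding strand will read.
coding_view = reverse_complement(THREE_PRIMER)
antisense_features = locate(coding_view, {**SITES, "TGA": "TGA"})

print(sense_features)
print(coding_view)
print(antisense_features)
```

Read on the coding strand, the 3' primer places a TGA codon ahead of the BglII and EcoRI sites, while the 5' primer places a HindIII site immediately ahead of the ATG codon.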

PCR was carried out in the same manner as described above according to Saiki's method in a
total volume of 100 ul containing 100 ng of plasmid pUCN3, as a template, and 2 ul each of 3' and 5'
primers. The reaction mixture was heated at 95 °C for 5 min and quenched at 0 °C. One minute later,
0.5 ul of Taq DNA polymerase (7 U/ul, AmpliTaq™, Takara Shuzo) was added to the mixture, which was mixed thoroughly
and overlaid with mineral oil. The sample was reacted by repeating 25 cycles of treatments which
comprise: at 95 °C for 1 min; at 60 °C for 1 min; and at 72 °C for 5 min in a DNA Thermal Cycler (Perkin
Elmer Cetus). The resultant reaction solution was extracted with phenol/chloroform, and precipitated with
ethanol conventionally. The amplified DNA samples were digested with HindIII and EcoRI, fractionated
by acrylamide gel electrophoresis and extracted (Molecular Cloning, Cold Spring Harbor (1982)).

The resultant DNA fragment was then ligated into the HindIII and EcoRI sites of a cloning vector pUC19,
cloned and screened to obtain plasmid pUCHN3. The resultant plasmid was sequenced and the sequence is shown in SEQ
ID NO 7. The sequence shows that it contains, at the 5'-terminus, a HindIII site followed by an ATG initiation
codon, and at the 3'-terminus, a termination codon TGA, BglII and EcoRI restriction sites, from 5' to 3'.

[2] Modification of DNA for the Expression of HCV Polypeptide Encoded by Clone N3N10 in E.coli

Clone N3N10 contains a DNA fragment capable of encoding structural protein of HCV which begins at
nucleotide No. 22 (A). The DNA can be expressed utilizing the ATG codon at nucleotides Nos. 22 to 24. The
modification of DNA was carried out using PCR. The following synthetic oligonucleotide primers were used.

5' primer:

5' GCAAGCTTATGAGCACAAATCCAAAACCCCAAAGA 3' (SEQ ID NO 145)

3' primer:

5' GCGAATTCAGATCTTCAGATTCTCTGAGACGGCCCTCGT 3' (SEQ ID NO 147)

The synthetic DNA was adjusted to 20 pmol/ml before use.

PCR was carried out in the same manner as the above [1] except that the above two primers and
plasmid pUCN3N10, as a template, were used and PCR was conducted by repeating 10 cycles of
treatments which comprise: at 95 °C for 1 min; at 50 °C for 1 min; and at 72 °C for 5 min, and then 20
cycles of treatments which comprise: at 95 °C for 1 min; at 65 °C for 1 min; and at 72 °C for 5 min.
The amplified DNA sample was digested with HindIII and EcoRI and fractionated by acrylamide gel
electrophoresis, and the gel containing a DNA fragment of desired length was extracted (Molecular Cloning, Cold
Spring Harbor (1982)). The resultant DNA fragments were then ligated into the HindIII and EcoRI sites of
cloning vector pUC19, cloned and screened conventionally to obtain plasmid pUCHN3N10. The plasmid
pUCHN3N10 was then sequenced.

The clone HN3N10 thus obtained contains, at the 5'-terminus, a HindIII site followed by an ATG initiation
codon, and at the 3'-terminus, a termination codon TGA, BglII and EcoRI restriction sites, from 5' to 3'.

For the removal of the BamHI site from the clone HN3N10, the nucleotide sequence 5' GGATCC 3' was
converted to 5' GGATAC 3' by PCR using the following synthetic DNA fragments as primers.

5' primer:

5' GCTACTCCGGATACCAC 3' (SEQ ID NO 148)

3' primer:

5' GTAAAACGACGGCCAGT 3' (SEQ ID NO 143)

The synthetic DNA was adjusted to 20 pmol/ml before use.
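
Whether the C to A change introduced by the 5' primer alters the encoded amino acids depends on the reading frame, which is fixed by the ATG codon of the clone and cannot be read off the primer alone. The following sketch, offered only as an illustration, translates the 17 nucleotides covered by the primer in each of the three possible frames using the standard genetic code and reports the frames in which the change is silent. The "original" sequence is reconstructed here simply by reverting GGATAC to GGATCC within the primer; the variable and function names are hypothetical.

```python
# Checking in which reading frame(s) the GGATCC -> GGATAC change is silent.
# Uses the standard genetic code; names are illustrative only.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

MUTANT = "GCTACTCCGGATACCAC"                     # SEQ ID NO 148
ORIGINAL = MUTANT.replace("GGATAC", "GGATCC")    # assumed template sequence


def translate(seq, frame):
    """Translate complete codons of seq starting at offset `frame` (0, 1 or 2)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3))


silent_frames = [f for f in range(3)
                 if translate(MUTANT, f) == translate(ORIGINAL, f)]

for f in range(3):
    print(f, translate(ORIGINAL, f), translate(MUTANT, f))
print("silent in frame offset(s):", silent_frames)
print("BamHI site removed:", "GGATCC" not in MUTANT)
print("MroI site (TCCGGA) kept:", "TCCGGA" in MUTANT)
```

Only one of the three frames leaves the translation unchanged (an ATC to ATA change, both encoding isoleucine), which is consistent with the statement in Example 6 [2] that the amino acid sequences deduced for clone HN3N10AB and clone N3N10 are the same. The primer also retains a TCCGGA sequence, the recognition sequence of MroI, which is the enzyme used in the subsequent subcloning step.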

The nucleotide "G" at the 5'-terminus of the 5' primer corresponds to the No. 1016 G of the base sequence
of clone N3N10. The 3' primer is derived from plasmid pUC19 and is the same as one of the primers used for
sequencing in Example 3. The PCR was conducted by repeating 25 cycles of treatments which comprise:
at 95 °C for 1 min; at 55 °C for 1 min; and at 72 °C for 1 min. For the reaction, 3 ul of each primer and
100 ng of plasmid pUCHN3N10, as a template DNA, were used. The reaction mixture was then subjected to
phenol/chloroform extraction and ethanol precipitation conventionally. The amplified DNA sample was
digested with MroI, BglII, and BamHI and fractionated by acrylamide gel electrophoresis, and the gel
containing a desired 226 bp DNA fragment was extracted (Molecular Cloning, Cold Spring Harbor (1982)). The resultant
DNA fragments were then ligated into the MroI and BglII sites of plasmid pUCHN3N10, cloned and screened by
a conventional method to obtain plasmid pUCHN3N10AB. The resultant plasmid pUCHN3N10AB was then
sequenced and the base sequence of clone HN3N10AB is shown in SEQ ID NO 8.

[3] Modification of DNA for the Expression of HCV Polypeptide Encoded by Clone N3N10 in Insect Cells

Clone N3N10 appears to contain entire viral protein-encoding genes including those encoding core and
envelope (M-gp35) proteins. The region beginning at nucleotide No. 22 (A), which encodes structural protein,
was expressed in insect cells utilizing the ATG codon at nucleotides Nos. 22 to 24. When insect cells were
transfected with the DNA and cultivated, core and envelope (M-gp35) proteins were expressed in the fused
form as a precursor polypeptide, which was then processed to separate core and envelope (M-gp35). At
least the latter envelope (M-gp35) was then glycosylated incompletely and accumulated intracellularly. The
modification of DNA of clone N3N10 for the construction of the expression vector was carried out by PCR using
the following synthetic oligonucleotide primers.
5' primers:

MS106: 5' GCGTCGACGCTAGCATGAGCACAAATCCAAAACCC 3' (SEQ ID NO 149)
MS107: 5' GCGTCGACGCTAGCAGGTCTCGTAGACCGTGCATC 3' (SEQ ID NO 150)

3' primer:

MS108: 5' GCGAATTCGCTAGCTCAGGATTCTCTGAGACGGCCCTCGA 3' (SEQ ID NO 151)

These three synthetic DNAs were separately adjusted to 20 pmol/ml before use.

The PCR was carried out using the same reaction solution and worked up in the same manner as
described in the above [1] except that plasmid pUCN3N10 was used as a template plasmid, and, as the
5' primer, primer MS106 or MS107 and, as the 3' primer, MS108 were used. PCR was accomplished by
repeating 10 times a reaction cycle consisting of: 1 min at 95 °C; 1 min at 50 °C; and 5 min at 72 °C; and
then 20 times a reaction cycle consisting of: 1 min at 95 °C; 1 min at 65 °C; and 5 min at 72 °C. A
combination of primers MS106 and MS108 gave a desired 1265 bp DNA fragment 106-108 and that of
primers MS107 and MS108 gave a desired 1286 bp DNA fragment 107-108.

These DNA fragments were digested with NheI, fractionated by acrylamide gel electrophoresis and
extracted by conventional means (Molecular Cloning, Cold Spring Harbor (1982)) to obtain DNA fragments of
desired length. Each of the resultant DNA fragments was then ligated into the NheI restriction site of a transfer
vector pBlueBac (Invitrogen), cloned and screened by the usual method to yield plasmids pBlueN3N10-1
and pBlueN3N10-2, which are derived from DNA fragments 106-108 and 107-108, respectively.

Plasmids pBlueN3N10-1 and pBlueN3N10-2 were digested with NheI or BamHI completely to confirm
that each plasmid contains only one DNA fragment, either 106-108 or 107-108, inserted at the NheI site.
Furthermore, taking account of the instructions provided by the manufacturer (Invitrogen), the expression unit
of these plasmids contains a gene encoding HCV structural polypeptide (core and envelope) oriented forward
and ligated to the NheI cloning site downstream from a polyhedrin promoter.



Example 6

Expression of HCV Polypeptides Encoded by Clones HN3, HN3N10AB 
[1] Expression of Polypeptide Encoded by Clone HN3 in E.coli 


Clone HN3 encodes a part of the polypeptide encoded by cDNA originated from serum of an HC patient. The
polypeptide encoded by clone HN3 was expressed directly in E. coli, as it is, by subcloning said clone into
an expression vector pCZ44 (Japanese Patent Publication No. 124387/1989).

Clone HN3 was digested thoroughly with restriction enzymes HindIII and BglII, extracted with
phenol/chloroform, precipitated with ethanol, and separated by acrylamide gel electrophoresis. From the gel was
extracted a DNA fragment having cohesive HindIII- and BglII-restricted ends (Molecular Cloning, Cold
Spring Harbor, 1982). The expression vector pCZ44 was digested with HindIII and BglII. The larger DNA
fragment containing a region functional for replication in E. coli was separated, treated in the same
manner, ligated to the HindIII-BglII fragment of clone HN3 so as to have only one insertion, and cloned by
a conventional method to yield plasmid pCZCORE.

Alternatively, an expression vector was constructed using an expression vector pGEX-2T (Pharmacia)
for the expression of a fused protein of a desired polypeptide and glutathione-S-transferase (GST). The
construction was carried out substantially in accordance with the protocol taught by the manufacturer
(Pharmacia). Thus, the expression vector pGEX-2T was digested with BamHI. The linearized vector was
ligated with a HindIII linker to obtain a DNA fragment having EcoRI and HindIII restriction sites at the 3'- and
5'-termini. The fragment was ligated to the HindIII-EcoRI fragment of HN3 such that every reading frame of
codon is consistent with an amino acid of clone N3-1 to yield an expression vector pGEXCORE.

E. coli K12 strains (e.g., JM109, KS476) or those derived from B strains transformed with plasmid
pCZCORE were grown in L-Broth at 37 °C overnight (Molecular Cloning, Cold Spring Harbor, 1982). The
cultured broth was diluted 50-fold by inoculating it into freshly prepared L-Broth and the cultivation
was continued with shaking at 30 °C for 2 hr. At this time, IPTG (isopropyl-β-D-galactopyranoside) was added to
the culture to a final concentration of 1 or 2 mM in order to induce the expression of DNA encoding the
HCV-originated CORE-N3 polypeptide by single-clone-derived transformants (E. coli cells transformed solely by
plasmid pCZCORE derived from clone HN3). The base sequence and deduced amino acid sequence of clone
HN3 are shown in SEQ ID NO 7.

As mentioned above, plasmid pGEXCORE can be used to obtain transformants capable of
expressing a fused protein including the desired polypeptide and GST. The plasmid encodes a fused protein
GST-CORE comprising GST, which has a thrombin-cleaving site at its C-terminus, and a polypeptide
derived from clone HN3, the same polypeptide as that encoded by plasmid pCZCORE. The transformants
containing pGEXCORE were grown in the presence of IPTG using the same protocol as that used for the
expression of CORE-N3 polypeptide of HCV from transformants harboring pCZCORE to produce the fused
polypeptide GST-CORE.


[2] Expression of Polypeptides Encoded by Clone HN3N10AB

Clone HN3N10AB, encoding a part of the polypeptide encoded by cDNA originated from serum of an HC
patient, was expressed in E. coli to give polypeptide CME-N3N10AB in the same manner as the above [1].
The cDNA used was that contained in clone HN3N10AB obtained from serum of an HC patient, which clone
had been previously isolated and sequenced as described in Example 3, Example 4 [1], and Example 5
[2]. The expression plasmid pCZCMEAB was constructed by subcloning a DNA fragment isolated from
plasmid pUCHN3N10AB by ligating its HindIII and BglII cohesive ends to the HindIII and BglII sites of plasmid
pCZ44 such that only one DNA fragment should be inserted in an appropriate orientation by the same
method used for the preparation of plasmid pCZCORE. Plasmid pCZCMEAB was then subjected to
sequencing and restriction enzyme mapping to confirm that the expression unit of plasmid pCZCMEAB was
reconstructed properly.

The cultivation of transformants was carried out in the presence of IPTG in order to induce the
expression of the HCV-originated CME-N3N10AB polypeptide by single-clone-derived transformants (E. coli JM109
cells transformed solely by plasmid pCZCMEAB derived from clone HN3N10AB, a variant of clone
N3N10). The base sequence and deduced amino acid sequence of cDNA obtained from serum of an HC patient
contained in clone HN3N10AB are shown in SEQ ID NO 8. The amino acid sequences deduced from the base
sequences of clone HN3N10AB and its original clone N3N10 were exactly the same.

In the same manner as the above [1], plasmid pGEXCMEAB was constructed and transformed into host
cells. The transformants, when grown with IPTG induction under the same conditions as for transformants harboring plasmid
pCZCMEAB, expressed a fused protein GST-CME-N3N10AB.

[3] Expression of Polypeptide Encoded by Clone N3N10 in Insect Cells 



The expression of the structural polypeptide (core, envelope (M-gp35)) of HCV encoded by plasmid
pBlueN3N10-1 prepared in Example 5 [3] was conducted substantially in accordance with a known expression
manual for baculovirus (MAXBAC™ BACULOVIRUS EXPRESSION SYSTEM MANUAL VERSION 1.4,
hereinafter referred to as Maxbac, Invitrogen).

Plasmids pBlueN3N10-1 and pBlueN3N10-2, plasmids prepared by inserting a DNA fragment containing
the HCV structural gene at the NheI site of a transfer vector pBlueBac (Maxbac, pp.37), were recovered from
E. coli host cells transformed thereby, and purified according to the method of Maniatis et al. (Molecular
Cloning, Cold Spring Harbor Laboratory, pp.86 - 96 (1982)). Thus, a large amount of HCV structural
gene-containing transfer plasmid DNA was obtained. Sf9 cells were co-transfected with 2 ug of either of plasmids
pBlueN3N10-1 or pBlueN3N10-2 and 1 ug of AcNPV viral DNA (Maxbac, pp.27). Sf9 cells were grown in
TMN-FH medium (Invitrogen) containing 10% FCS (fetal calf serum) in a 6 cm dish (60 x 15 mm,
FALCON®; Nippon Becton Dickinson Co., Ltd.) until the cell density reached about 2 x 10^6/plate. The
TMN-FH medium was removed and 0.75 ml of Grace medium (Gibco) containing 10% FCS was added thereto. To
the DNA mixture described above was added 0.75 ml of transfection buffer (attached to the kit), and the mixture was
thoroughly mixed by vortexing and gradually added dropwise onto the Grace medium. After the culture was
allowed to stand for 4 hr at 27 °C, the Grace medium was replaced with 3 ml of TMN-FH medium containing
10% FCS and the dish was incubated at 27 °C for 6 days. Three days into the incubation, a few
multinucleate cells were observed, and on the sixth day almost all the cells were multinuclear. The supernatant was taken into
a centrifuging tube and centrifuged at 1,000 rpm for 10 min to obtain the supernatant as a cotransfected viral
solution.

The cotransfected viral solution contained about 10^8 viruses/ml, 0.5% of which were recombinant
viruses. The isolation of recombinant virus was carried out by a plaque isolation method described below.

Thus, cells were adsorbed onto a 6 cm dish by seeding 1.5 x 10^6 cells in medium and removing the
medium completely. To the dish was added 100 ul of a diluted viral solution (10^-4 and 10^-5 fold dilutions,
separately), and the dish was incubated at room temperature for 1 hr while slanting the 6 cm dish every 15 min to spread
the virus extensively. X-gal medium containing agarose was prepared by adding
5-bromo-4-chloro-3-indolyl-β-D-galactoside to a final concentration of 150 ug/l (Maxbac, pp. 16-17) to a warm medium which had been
prepared by autoclaving 2.5% baculovirus agarose (Invitrogen) at 105 °C for 10 min, mixing with TMN-FH
medium containing 10% FCS preheated at 46 °C at the mixing ratio of 1 : 3, and keeping the temperature
at 46 °C.
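
The titer and recombinant fraction stated above give a rough idea of how many plaques to expect on each dish. The following sketch is illustrative only: it assumes that every virus particle in the inoculum forms one plaque, and it takes the two dilutions as 10^-4 and 10^-5 for the purpose of the example. The function name `expected_plaques` is hypothetical.

```python
# Expected plaque numbers per dish from the stated titer (about 1e8 viruses/ml),
# the stated recombinant fraction (0.5%) and a 100 ul inoculum.
# Assumptions: one plaque per virus particle; dilutions of 1e-4 and 1e-5.

TITER_PER_ML = 1e8
RECOMBINANT_FRACTION = 0.005
INOCULUM_ML = 0.1            # 100 ul of diluted viral solution
DILUTIONS = (1e-4, 1e-5)     # assumed dilution factors


def expected_plaques(dilution):
    """Return (total plaques, recombinant plaques) expected on one dish."""
    total = TITER_PER_ML * dilution * INOCULUM_ML
    return total, total * RECOMBINANT_FRACTION


for d in DILUTIONS:
    total, recombinant = expected_plaques(d)
    print(f"dilution {d:g}: about {total:.0f} plaques, "
          f"about {recombinant:.1f} recombinant")
```

Under these assumptions only a handful of recombinant plaques are expected per dish, which illustrates why blue plaques are picked and the plaque isolation is repeated.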

After the completion of infection, the virus solution was aspirated thoroughly from the 6 cm dish and 4 ml of
the warm X-gal medium containing agarose (previously prepared) was gently added to each 6 cm dish so as not
to peel off cells. The dish was kept open by slightly sliding the lid until the agarose solidified and dried, and
thereafter the dish was covered, turned upside down, and incubated at 27 °C for 6 days. The plaques were
observed under a phase-contrast microscope to find blue plaques which do not form multinucleate cells.
Agarose containing blue recombinant plaques was removed with a Pasteur pipet and suspended in 1 ml
of TMN-FH medium by pipetting many times. The above process, which comprised infection, 6-day
incubation, and isolation of virus containing transfer plasmid DNA, is called the "plaque method". The
plaque method was repeated using 100 ul of viral suspension. After repeating said process three times,
a recombinant virus having a gene encoding structural protein derived from HCV, free from
contamination with wild-type virus, was obtained.

A viral solution of the primary recombinant virus was prepared by aspirating plaques with a Pasteur
pipet, and mixing thoroughly with 1 ml of TMN-FH medium. Because the primary viral solution was low in
virus density for infection, it required further treatment for concentration. Thus, 100 ul of viral solution was
adsorbed onto Sf9 cells grown in a 6 cm dish to semi-confluence, and 4 ml of TMN-FH medium was added
thereto and incubated for three days. The culture supernatant was recovered to yield a recombinant viral
solution for infection.

For the production of HCV structural protein, a suspension of Sf9 cells in TMN-FH medium containing
10% FCS (5x10* cells/10 ml medium) was added into a 9 cm dish and kept for 1 hr for adsorption. After the
removal of medium, 250 ul of recombinant viral solution was added to the 9 cm dish and spread
extensively. To the dish was added 10 ml of TMN-FH medium containing 10% FCS and the dish was incubated at 27 °C for
4 days. The cells expressing recombinant glycoprotein of HCV were harvested by scraping and
suspended in 1,000 ml of phosphate buffered saline.

Thus, the HCV structural gene was expressed in Sf9 cells transfected with said virus. The transformants
transformed with plasmids pBlueN3N10-1 and pBlueN3N10-2 expressed the same HCV polypeptide.

Example 7 



Identification of Expression Products as HCAg

The expression products obtained in Example 6, namely the CORE-N3 and CME-N3N10AB polypeptides
and the HCV polypeptide encoded by clone N3N10 expressed in insect cells, were immunologically
reactive with antiserum obtained from HC patients, demonstrating that these expression products are
HC-associated antigens.

Identification of these expression products as HCAg was carried out by Western blotting as follows. E. coli
cells transformed with either of plasmids pCZCORE and pCZCMEAB, encoding the CORE-N3 and
CME-N3N10AB polypeptides, respectively, were grown in the presence of IPTG for 3 hr or overnight in the
same manner as described in Example 6.

Recombinant strains were harvested by centrifuging 1,000 µl of the cultured broth at 6,500 rpm for 10 min.
The pellet was dissolved in a sample solution (50 mM Tris-HCl, pH 6.8, containing 2% SDS, 5%
mercaptoethanol, 10% glycerin, and 0.005% bromophenol blue) for SDS-polyacrylamide gel electrophoresis
to a final volume of 0.2 ml. Sf9 cells infected with viruses which had been treated more than 3 times by
the plaque method were collected by scraping and suspended in 1,000 ml of phosphate-buffered
physiological saline (PBS), and 100 µl of the suspension was centrifuged at 6,500 rpm for 10 min to pellet the
cells. The pellet was dissolved in a sample solution for SDS-polyacrylamide gel electrophoresis to a final
volume of 0.2 ml.

The sample solutions were then boiled at 100 °C for 10 min. Ten µl of the boiled solution was loaded
onto a 0.1% SDS-15% polyacrylamide gel (70 x 85 x 1 mm) together with a marker protein, LMW Kit E
(low-molecular-weight marker protein, Pharmacia). Electrophoresis was carried out at a constant current of 30
mA for 45 min in Tris buffer (25 mM Tris, pH 8.3, 192 mM glycine, 0.1% SDS) as electrode buffer.
Thereafter, the separated proteins were transferred electrophoretically to a nitrocellulose filter by
superposing the gel onto a filter BA-83 (S & S) and applying a constant current of 120 mA for about 20 min
between the gel (cathode) and the filter (anode) in a conventional manner.
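As an illustrative check of the loading arithmetic above (not part of the original procedure), the following Python sketch estimates how much of the original culture or cell suspension is represented by the 10 µl loaded per lane. The function name is hypothetical; the volumes are those stated in this Example.

```python
def loaded_equivalent_ul(source_ul, sample_ul, load_ul):
    """Volume of the original culture/suspension represented by one gel load.

    source_ul: volume that was pelleted
    sample_ul: final volume of the dissolved pellet (sample solution)
    load_ul:   volume loaded onto the gel
    """
    return load_ul * source_ul / sample_ul


# E. coli: 1,000 µl of broth pelleted, dissolved to 0.2 ml, 10 µl loaded
ecoli_equiv = loaded_equivalent_ul(1000.0, 200.0, 10.0)

# Sf9: 100 µl of cell suspension pelleted, dissolved to 0.2 ml, 10 µl loaded
sf9_equiv = loaded_equivalent_ul(100.0, 200.0, 10.0)

print(f"E. coli lane: {ecoli_equiv} µl of culture")
print(f"Sf9 lane:     {sf9_equiv} µl of cell suspension")
```

Under these stated volumes, an E. coli lane carries ten times more source volume than an Sf9 lane, which may be worth keeping in mind when band intensities are compared.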

The blotted filter was cut to separate a part containing the marker protein (referred to as marker filter)
from that containing the sample (referred to as sample filter). The former was stained with 0.1% (w/v)
Amido Black 10B and the latter was immersed in 0.01 M PBS (pH 7.4) containing 5% (w/v) bovine serum
albumin (BSA). Serum from an HC patient was diluted 50 times with 0.01 M PBS (pH 7.4) containing 5%
(w/v) BSA. To the sample filter was added 10 µl of diluted serum, and the filter was allowed to stand for 2 hr at
room temperature. Thereafter, the filter was washed with PBS containing 0.1% (v/v) Tween 20 for 20 min
(x3).

The sample filter was then reacted with 10 ml of horseradish peroxidase-conjugated anti-human IgG
(Cappel) at 37 °C for 1 hr and washed with PBS containing 0.1% (v/v) Tween 20 for 20 min (x3). The filter
was then immersed in a peroxidase color-producing solution (60 mg 4-chloro-1-naphthol, 20 ml methanol,
80 ml PBS, and 20 µl aqueous hydrogen peroxide). The colored filter was washed with distilled water and
compared with the marker filter, demonstrating that each of the CORE-N3 and CME-N3N10AB preparations
contained only one colored protein, having a molecular weight reasonable for an expression product of the
cDNAs originated from serum of HC patients and contained in plasmids pCZCORE and pCZCMEAB, respectively.

Cells transformed with plasmid pBlueN3N10-1 or pBlueN3N10-2, both of which encode the polypeptide
encoded by clone N3N10, expressed HCV polypeptides showing the same pattern on detection. A
protein with a molecular weight of about 22 kD was expressed, which corresponds to the calculated molecular
weight of an expression product from the core-encoding gene contained in clone N3N10. Thus, said
core-encoding gene, when expressed, gives a protein with a calculated molecular weight of about 22 kD (without
modification). As a result, the expressed product was identified as a hepatitis C associated antigenic
polypeptide presumably derived from the HCV core protein.

Example 8 


Comparison of Clones Obtained in Example 2 [2] and [3] 

Three clones corresponding to SEQ ID NO 1 were separately cloned using serum from an HC patient
according to the method described in Example 2 [2] (using random primers) and sequenced. On the other
hand, three clones corresponding to SEQ ID NO 1 were separately cloned using serum from the same HC
patient according to the method described in Example 2 [3] (using antisense primers) and sequenced.

Clones obtained using random primers had the same base sequence as that shown by SEQ ID NO 1,
whereas, when the synthetic primers S1 and AS1 were used, two of the three clones obtained independently had
the base sequence of SEQ ID NO 1, and one clone had a base sequence which differed from that of SEQ ID
NO 1 at three nucleotides. Thus, at No. 345, A was changed to C; at No. 322, A was changed to T; and at No. 95,
A was changed to C. These differences indicate that the patient is infected with at least two kinds of virus.
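The clone comparison described above amounts to listing the positions at which two aligned sequences of equal length differ. The following is a minimal Python sketch of such a comparison; the function name and the short toy sequences are hypothetical and are not the sequence of SEQ ID NO 1.

```python
def point_differences(reference, variant):
    """Return (1-based position, reference base, variant base) for each mismatch.

    Both sequences are assumed to be already aligned and of equal length,
    i.e. only substitutions are considered, not insertions or deletions.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (pos, ref_base, var_base)
        for pos, (ref_base, var_base) in enumerate(zip(reference, variant), start=1)
        if ref_base != var_base
    ]


# Toy sequences for illustration only
toy_reference = "ACGTACGTAC"
toy_variant = "ACGTCCGTAT"

for pos, ref_base, var_base in point_differences(toy_reference, toy_variant):
    print(f"No. {pos}: {ref_base} changed to {var_base}")
```

Positions are reported 1-based so that they read the same way as the nucleotide numbers used in the text.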

The above facts demonstrate that there is no substantial difference between clones obtained by
the method in Example 2 [2] and those obtained by the method in Example 2 [3].

Example 9

Synthesis of cDNA 

[1] Preparation of RNA Sample Solution 


An RNA sample solution was prepared by dissolving the dried nucleic acid obtained in Example 1 in 30 µl
of water containing 10 µl of ribonuclease inhibitor (100 U/µl, Takara Shuzo, Japan).

[2] Synthesis of cDNA Using Antisense Primer 


To 2 µl of the RNA sample solution prepared in the above [1] were added 1 µl of 15 pmol/µl antisense primer
(synthesized primer MS122, MS157 or MS148), 2 µl of 10 x RT buffer (100 mM Tris-HCl (pH 8.3), 500 mM
KCl), 4 µl of 25 mM MgCl2, 8 µl of 2.5 mM 4 dNTPs, and 1 µl of water, and the mixture was incubated at 65 °C
for 5 min, then at room temperature for 5 min. To the mixture were added 1 µl of reverse transcriptase (25 U,
Life Science) and 1 µl of ribonuclease inhibitor (100 U/µl, Takara Shuzo), and the mixture was incubated at
37 °C for 20 min, at 42 °C for 30 min, and finally at 95 °C for 2 min, which was followed by immediate cooling
to 0 °C (synthesis of cDNA).
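The volumes listed above add up to a 20 µl reaction, which can be confirmed with a short calculation. The following Python sketch is given only as an arithmetic illustration; the dictionary keys and the helper name are hypothetical, while the volumes and stock concentrations are those stated in the text.

```python
# Volumes (µl) of each component of the reverse-transcription mixture
components_ul = {
    "RNA sample solution": 2.0,
    "antisense primer (15 pmol/µl)": 1.0,
    "10 x RT buffer": 2.0,
    "25 mM MgCl2": 4.0,
    "2.5 mM 4 dNTPs": 8.0,
    "water": 1.0,
    "reverse transcriptase": 1.0,
    "ribonuclease inhibitor": 1.0,
}

total_ul = sum(components_ul.values())


def final_concentration(stock, volume_ul, total_volume_ul):
    """Dilution: final = stock x (volume added / total volume)."""
    return stock * volume_ul / total_volume_ul


buffer_x = final_concentration(10.0, components_ul["10 x RT buffer"], total_ul)
mgcl2_mM = final_concentration(25.0, components_ul["25 mM MgCl2"], total_ul)
dntp_mM = final_concentration(2.5, components_ul["2.5 mM 4 dNTPs"], total_ul)
primer_pmol_per_ul = final_concentration(
    15.0, components_ul["antisense primer (15 pmol/µl)"], total_ul
)

print(f"total volume: {total_ul} µl")
print(f"RT buffer: {buffer_x} x, MgCl2: {mgcl2_mM} mM, dNTPs: {dntp_mM} mM")
print(f"primer: {primer_pmol_per_ul} pmol/µl")
```

The same helper can be applied to the 100 µl PCR mixture if its final concentrations are of interest.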

Amplification of DNA containing specific sequences was conducted by PCR (Saiki et al., Nature 324:
126 (1986)). Thus, a 100 µl mixture containing 10 µl of cDNA solution, 10 µl of 10 x PCR buffer (100 mM
Tris-HCl (pH 8.3), 500 mM KCl, 15 mM MgCl2, 1% gelatin), 8 µl of 2.5 mM 4 dNTPs, 2 µl of 150 pmol/µl
synthetic DNA primer (the same primer as used in the synthesis of cDNA), 3 µl of 15 pmol/µl synthetic
DNA primer (the counterpart of the pair of primers, i.e., MS122-MS123, MS157-MS156, or MS148-MS146) and
water was incubated at 95 °C for 5 min, then immediately cooled to 0 °C. One minute later, it was mixed
with 0.5 µl of Taq DNA polymerase (7 U/µl, AmpliTaq™, Takara Shuzo) and overlaid with mineral oil. The
resultant sample was then subjected to PCR. PCR was conducted by repeating 25 reaction cycles,
each comprising the following treatments: at 95 °C for 1 min; at 40 - 55 °C for 1 min; and at 72 °C for 1 - 5
min, in a DNA Thermal Cycler (Perkin Elmer Cetus). Finally, the reaction mixture was incubated at 72 °C for 7
min, which was followed by phenol/chloroform extraction and ethanol precipitation to obtain the different
amplified DNA fragments derived from each of the above-mentioned pairs of primers.

Example 10 

Cloning and Sequencing of Amplified DNA Fragments 



Dried DNA fragment (at least 1 pmole) obtained in the above Example 9 [2] was blunt-ended with T4
DNA polymerase (Toyobo), 5'-end phosphorylated with polynucleotide kinase (Toyobo), and ligated into the
SmaI site of the multi-cloning site of 5 ng to 10 ng of pUC19 cloning vector. The cloning vector had been
previously treated as follows: digestion with the restriction enzyme SmaI (Toyobo), phenol/chloroform
extraction, ethanol precipitation, 5'-end dephosphorylation with alkaline phosphatase (Boehringer Mannheim)
(Molecular Cloning (1982) Cold Spring Harbor), phenol/chloroform extraction, and ethanol precipitation. The
ligated DNA was used to transform competent E. coli JM109 or DH5 cells (Toyobo). The transformation
was carried out according to the manufacturer's protocol (COMPETENT HIGH, Toyobo).
Plasmid clones were recovered from transformed cells conventionally. At least 20 transformants were
obtained for each pUC19 cloning vector containing a DNA fragment obtained using one of the pairs of
primers in the same manner as that described in Example 9 [2].

Plasmid DNA was isolated from the corresponding transformant by a usual method and sequenced. The
determination of the base sequence was conducted by means of a Fluorescent DNA Sequencer (GENESIS 2000,
Dupont) using, as sequencing primers, the following synthetic primers:
5' d(GTAAAACGACGGCCAGT) 3' (SEQ ID NO 143) and
5' d(CAGGAAACAGCTATGAC) 3' (SEQ ID NO 144) for the + and - strands of the DNA fragment to be
sequenced.

Base sequences of the DNA fragments are given in SEQ ID NO 13 to 27, which show the base sequences
of the + strand of the HCV genes inserted into each plasmid used for the transformation. These clones are
double-stranded DNA. Plasmids used for the sequencing of clones N19-1, N19-2 and N19-3 were designated as
plasmids pUCN19-1, pUCN19-2 and pUCN19-3, respectively. Each plasmid contained one DNA molecule
corresponding to each DNA fragment. In the same manner, a plasmid which contains a single clone and is
used for the sequencing of the same is designated by adding the prefix "pUC" to the name of the clone.

These base sequences represent the base sequences of clones obtained by cloning the cDNA
synthesized from RNA isolated from serum of patient(s) suffering from HC. Therefore, these sequences are
specific for clones originated from serum of HCV-infected patients and cannot be found in or obtained from
serum of healthy subjects. Thus, cDNA prepared from RNA (if any) obtained from a healthy
subject under more stringent conditions, for instance by increasing the number of PCR reaction cycles in
Example 9 [2] and [3] three- or four-fold (repeating them 60 - 100 times), did not show any homology in base
sequence with those shown in SEQ ID NO 13 to 27. Consequently, the base sequences of the clones shown in
SEQ ID NO 13 to 27 are specific for those obtained from serum of HC patients.

The base sequences of the DNA fragments were compared with a known base sequence of the HCV gene. As
can be seen from the fact that three clones N19-1, N19-2 and N19-3 were obtained from serum of one HC
patient in Example 9 [2] using primers MS122 and MS123, there must be more than one virus in a patient.

Example 11

Preparation of Clones N27MX24A-1 and N27MX24B-1

[1] Preparation of Clones N19MX24A-1 and N19MX24B-1 

One µl (about 0.5 to 1 µg/µl) of each DNA fragment from clones N19-1 and MX24-4 was added into a
reaction mixture containing 10 µl of 10 x PCR buffer (100 mM Tris-HCl (pH 8.3), 500 mM KCl, 15 mM
MgCl2, 1% gelatin), 8 µl of 2.5 mM 4 dNTPs, 5 µl each of 20 pmol/µl synthetic primers S2 and AS3, and
76.5 µl of water. After thorough mixing, the mixture was heated at 95 °C for 5 min, then immediately
cooled to 0 °C. One minute later, to the mixture was added 0.5 µl of Taq DNA polymerase (7 U/µl,
AmpliTaq™, Takara Shuzo), and the mixture was mixed and overlaid with mineral oil. The sample was then
subjected to PCR. PCR was conducted by repeating 25 reaction cycles, each comprising the following
treatments: at 95 °C for 1 min; at 40 °C for 1 min; and at 72 °C for 2 min, in a DNA Thermal Cycler (Perkin
Elmer Cetus). It was followed by an incubation at 97 °C for 2 min. The mixture was immediately cooled to 0 °C,
kept at 0 °C for 2 min, and mixed with 0.5 µl of Taq DNA polymerase (7 U/µl, AmpliTaq™, Takara Shuzo). The
sample was then treated in the same manner as above by repeating 25 reaction cycles, each comprising the
following treatments: at 95 °C for 1 min; at 50 °C for 1 min; and at 72 °C for 2 min. After the final treatment
at 72 °C for 7 min, the resultant reaction solution was treated with phenol/chloroform, then precipitated with
ethanol. The amplified DNA samples were fractionated by agarose gel electrophoresis, and a gel slice containing
a fragment having the desired length was removed (Molecular Cloning (1982) Cold Spring Harbor) to isolate
the DNA fragment therefrom conventionally. The resultant DNA fragment was then modified as described in
Example 10 and ligated into the SmaI site of the multi-cloning site of pUC19, cloned and screened as described
in Example 10 to obtain plasmids pUCN19MX24A-1 and pUCN19MX24B-1. The resultant cDNAs derived from
serum of an HC patient were referred to as clones N19MX24A-1 and N19MX24B-1, the base sequences of which
are given in SEQ ID NO 29 and 30.

[2] Preparation of Clone N27N19-1 



Two overlapping clones N27-3 and N19-1 were ligated by taking advantage of a unique restriction site
which exists in the overlapping region of both clones. Upon digestion with restriction enzyme MluI,
clone N27-3 is cleaved at the 3' side of nucleotide No. 330 (A) and clone N19-1 at the 3' side of
nucleotide No. 51 (A). The ligation of clones N27-3 and N19-1 was accomplished on the basis of the
assumption that plasmids pUCN27-3 and pUCN19-1 contain each DNA fragment in the same orientation.
Thus, plasmid pUCN27-3 was digested with HindIII and MluI to isolate a DNA fragment containing the 5' region
of clone N27-3, which comprises a HindIII-SmaI DNA fragment of plasmid pUC19 attached to the 5' end of
clone N27-3, a cDNA derived from serum of an HC patient. The DNA fragment was then exchanged with a
HindIII-MluI fragment of clone N19-1 containing the 3' region of said clone, cloned and screened to obtain
plasmid pUCN27N19-1. The plasmid pUCN27N19-1 contained the desired clone N27N19-1 comprising
clones N27-3 and N19-1 ligated without overlapping. The base sequence of clone N27N19-1 is shown in
SEQ ID NO 28.
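The joining of two overlapping clones at a shared unique MluI site can be pictured in silico as follows. This is only a minimal Python sketch under stated assumptions: the function names and the short toy sequences are hypothetical, the clones are assumed to be in the same orientation (as in the text), and MluI is taken to recognize ACGCGT and to cleave after the first A (A^CGCGT), consistent with the cleavage positions described above.

```python
MLUI_SITE = "ACGCGT"  # MluI recognition sequence; cleavage occurs after the first A


def mlui_cut_position(seq):
    """Return the 0-based index at which MluI cuts, requiring a unique site."""
    if seq.count(MLUI_SITE) != 1:
        raise ValueError("expected exactly one MluI site in the sequence")
    return seq.index(MLUI_SITE) + 1


def join_at_mlui(upstream_clone, downstream_clone):
    """Join the 5' part of one clone to the 3' part of an overlapping clone.

    The overlap between the two clones is therefore represented only once.
    """
    left = upstream_clone[: mlui_cut_position(upstream_clone)]
    right = downstream_clone[mlui_cut_position(downstream_clone):]
    return left + right


# Toy overlapping sequences for illustration only (not clones N27-3 or N19-1)
toy_upstream = "TTGGCCAACGCGTAAG"
toy_downstream = "CCAACGCGTAAGGTTC"

toy_joined = join_at_mlui(toy_upstream, toy_downstream)
print(toy_joined)
```

The uniqueness check reflects the requirement stated in the text that the restriction site be unique within the overlapping region; a clone with a second MluI site would need a different strategy.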

[3] Preparation of Clones N27MX24A-1 and N27MX24B-1 

Overlapping clones N27-3 and either of clones N19MX24A-1 and N19MX24B-1 were ligated by taking
advantage of a unique restriction site which exists in the overlapping region of both clones. Upon
digestion with restriction enzyme MluI, clone N27-3 is cleaved at the 3' side of nucleotide No. 330 (A) and
clones N19MX24A-1 and N19MX24B-1 at the 3' side of nucleotide No. 71 (A). The ligation of the clones was
accomplished on the basis of the assumption that plasmids pUCN27-3, pUCN19MX24A-1 and
pUCN19MX24B-1 contain each DNA fragment in the same orientation. Thus, plasmid pUCN27-3 was
digested with HindIII and MluI to isolate a 363 bp DNA fragment which comprises a HindIII-SmaI DNA
fragment of plasmid pUC19 attached to the 5' end of clone N27-3, a cDNA derived from serum of an HC
patient. The DNA fragment was then exchanged with a 363 bp DNA fragment of clone N19MX24A-1 or
N19MX24B-1, which had been excised from plasmid pUCN19MX24A-1 or pUCN19MX24B-1 with HindIII and
MluI restriction enzymes, followed by cloning and screening. The resultant plasmids pUCN27MX24A-1 and
pUCN27MX24B-1 contained the desired clones N27MX24A-1 and N27MX24B-1, each comprising clone
N27-3 and either of clones N19MX24A-1 and N19MX24B-1 ligated without overlapping. The base
sequences of clones N27MX24A-1 and N27MX24B-1 are shown in SEQ ID NO 31 and 32, respectively.

Example 12

Modification of DNA for the Expression of HCV Polypeptide Encoded by Clones N27MX24A-1 and
N27MX24B-1

[1] Modification of DNA for the Expression of HCV Polypeptide Encoded by Clones N27MX24A-1 and
N27MX24B-1 in E. coli

Clones N27MX24A-1 and N27MX24B-1 appeared to encode an open reading frame from nucleotide
No. 2 (C) derived from the HCV gene, which can be expressed by inserting an ATG initiation codon in frame and
upstream from said gene so that the DNA is properly expressed in host cells. The
insertion of an ATG initiation codon upstream from the 5' terminus of said gene may be accompanied
by an addition of a foreign polypeptide which is not encoded by the HCV gene to the N-terminus (amino
terminus) of the amino acid sequence of SEQ ID NO 31 or 32. When an expression vector containing an
initiation codon for E. coli is used, a DNA fragment from the clone is ligated to the vector such that the frame
of said DNA is in conformity with that of the ATG codon. The modification of DNA can be carried out by
PCR. The modification procedures are hereinafter illustrated using clone N27MX24A-1. It will be appreciated
that clone N27MX24B-1 can be modified in just the same manner.

The following synthetic oligonucleotide primers were used.
5' primer:
MS2724-1: 5' GCAAGCTTATGCGGATCCCACAAGCCGTGGTGGAT 3' (SEQ ID NO 152)
5' primer for inserting the DNA fragment into a vector containing an initiation codon:
MS2724-2: 5' CGGATCCCACAAGCCGTGGTGGAT 3' (SEQ ID NO 153)
3' primer:
MS2724-3: 5' GCGAATTCAGATCTTCATCACTCTAAGGTGGCGTCGGCGTGGG 3' (SEQ ID NO 154)
The synthetic DNA was adjusted to 20 pmol/ml before use.
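The design of these primers can be inspected with a short script. The following Python sketch (helper names are hypothetical; the standard genetic code is assumed) checks that MS2724-1 is MS2724-2 preceded by a HindIII site and an ATG codon, translates the primer from that ATG to show the reading frame, and shows that the 5' tail of the 3' primer MS2724-3 reads, on the sense strand, as two termination codons followed by BglII and EcoRI sites.

```python
from itertools import product

MS2724_1 = "GCAAGCTTATGCGGATCCCACAAGCCGTGGTGGAT"          # SEQ ID NO 152
MS2724_2 = "CGGATCCCACAAGCCGTGGTGGAT"                     # SEQ ID NO 153
MS2724_3 = "GCGAATTCAGATCTTCATCACTCTAAGGTGGCGTCGGCGTGGG"  # SEQ ID NO 154

# Standard genetic code, codons ordered with bases T, C, A, G
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the first base; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# 5' tail that MS2724-1 carries in addition to MS2724-2
tail_5prime = MS2724_1[: len(MS2724_1) - len(MS2724_2)]

# Reading frame set by the inserted ATG
peptide = translate(MS2724_1[MS2724_1.index("ATG"):])

# Sense-strand sequence introduced by the 3' primer
sense_3prime = reverse_complement(MS2724_3)

print("5' tail:", tail_5prime, "| HindIII site present:", "AAGCTT" in tail_5prime)
print("translation from ATG:", peptide)
print("sense strand at 3' end:", sense_3prime[-20:])
```

The same checks could be applied to the other primer sets in this Example by substituting their sequences.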

PCR was carried out in the same manner as described above according to Saiki's method in a
total volume of 100 µl containing 100 ng of plasmid pUCN27MX24A-1, as a template, and 2 µl each of the 3'
and 5' primers. The reaction mixture was heated at 95 °C for 5 min and quenched at 0 °C. One minute
later, to the mixture was added 0.5 µl of Taq DNA polymerase (7 U/ml, AmpliTaq™, Takara Shuzo), and the
mixture was mixed thoroughly and overlaid with mineral oil. The sample was reacted by repeating 25 cycles
of treatments comprising: at 95 °C for 1 min; at 60 °C for 1 min; and at 72 °C for 5 min, in a DNA Thermal
Cycler (Perkin Elmer Cetus). The resultant reaction solution was extracted with phenol/chloroform and
precipitated with ethanol conventionally. The amplified DNA samples were digested with HindIII and EcoRI (when
MS2724-2 was used as the 5' primer, the DNA was blunt-ended with T4 DNA polymerase and digested with
EcoRI) and fractionated by acrylamide gel electrophoresis, and the DNA fragment of the
desired length was extracted from the gel (Molecular Cloning, Cold Spring Harbor (1982)).

The resultant DNA fragment was then ligated into the HindIII (SmaI, when MS2724-2 was used as the 5' primer)
and EcoRI sites of the cloning vector pUC19, cloned and screened to obtain plasmid pUCHN27MX24A-1
(plasmid pUCH2N27MX24A-1, when MS2724-2 was used). The resultant plasmid was sequenced. Clone
HN27MX24A-1 comprises a DNA fragment shown by the base sequence of SEQ ID NO 31, 32 except that
the 5' terminal C was removed and the following DNA fragment:

5' GCAAGCTTATG 3'
3' CGTTCGAATAC 5' (SEQ ID NO 155)

which comprises a HindIII restriction site followed by an initiation codon ATG, was added thereto, and the 3'
terminal two bases (AA) were removed from the base sequence of SEQ ID NO 31 and the following DNA
fragment:

5' TGATGAAGATCTGAATTCGC 3'
3' ACTACTTCTAGACTTAAGCG 5' (SEQ ID NO 156)

which comprises two termination codons, and BglII and EcoRI sites from 5' to 3', was added thereto.

Another clone, H2N27MX24A-1, obtained using primers MS2724-2 and MS2724-3, was sequenced,
showing that said clone has no additional DNA fragment at the 5' terminus but, at the 3' terminus, has the
same additional DNA fragment as that of the above clone HN27MX24A-1.

[2] Modification of a DNA Fragment for the Expression of HCV Polypeptide Comprising 106 Amino Acid
Sequence from No. 109 to 214 of SEQ ID NO 31, 32 in E. coli

A DNA fragment encoding a polypeptide comprising the 106 amino acid sequence from amino acid Nos. 109 to 214
of SEQ ID NO 31, 32 appeared to encode an open reading frame (ORF) from the HCV gene, which
can be expressed by inserting an ATG initiation codon in frame and upstream from said gene. The
insertion of an ATG initiation codon upstream from the 5' terminus of said gene may be accompanied
by an addition of a foreign polypeptide to the N-terminus (amino terminus) of said polypeptide. When an
expression vector containing an initiation codon for E. coli is used, a DNA fragment from the clone is
ligated to the vector such that the frame of said DNA is in conformity with that of the codon. The
modification of DNA can be carried out by PCR using the following synthetic oligonucleotide primers.
5' primer:
MSHNS1-1: 5' GCAAGCTTATGTTCAACGCGTCCGGATGTCCGGA 3' (SEQ ID NO 157)
5' primer for inserting the DNA fragment into a vector containing an initiation codon:
MSHNS1-2: 5' TTCAACGCGTCCGGATGTCCGGA 3' (SEQ ID NO 158)
3' primer:
MSHNS1-3: 5' GCGAATTCAGATCTTCATCAACAACCGAACCAGTTGCCCTGCG 3' (SEQ ID NO 159)
The synthetic DNA was adjusted to 20 pmol/ml before use.

PCR was carried out in the same manner as described above according to Saiki's method in a
total volume of 100 µl containing 100 ng of plasmid pUCN27MX24A-1 (or plasmid pUCN27MX24B-1), as a
template, and 2 µl each of the 3' and 5' primers. The reaction mixture was heated at 95 °C for 5 min and
quenched at 0 °C. One minute later, to the mixture was added 0.5 µl of Taq DNA polymerase (7 U/ml,
AmpliTaq™, Takara Shuzo), and the mixture was mixed thoroughly and overlaid with mineral oil. The sample
was reacted by repeating 25 cycles of treatments comprising: at 95 °C for 1 min; at 60 °C for 1 min; and at
72 °C for 5 min, in a DNA Thermal Cycler (Perkin Elmer Cetus). The resultant reaction solution was extracted
with phenol/chloroform and precipitated with ethanol conventionally. The amplified DNA samples were digested
with HindIII and EcoRI (when MSHNS1-2 was used as the 5' primer, the DNA was blunt-ended with T4 DNA
polymerase and digested with EcoRI), fractionated by acrylamide gel electrophoresis, and extracted
(Molecular Cloning, Cold Spring Harbor (1982)).

The resultant DNA fragment was then ligated into the HindIII (SmaI, when MSHNS1-2 was used as the 5' primer)
and EcoRI sites of the cloning vector pUC19, cloned and screened to obtain plasmid pUCH48 (plasmid
pUCH48-2, when primers MSHNS1-2 and MSHNS1-3 were used). The resultant plasmid was sequenced,
demonstrating that the clone H48 has a modified base sequence of SEQ ID NO 31, 32 wherein, at the 5'
side of No. 326 T, the following DNA fragment:

5' GCAAGCTTATG 3'
3' CGTTCGAATAC 5' (SEQ ID NO 155)

which fragment comprises a HindIII restriction site at the 5' terminus followed by an initiation codon ATG,
was added, and, at the 3' terminus, the following DNA fragment:

5' TGATGAAGATCTGAATTCGC 3'
3' ACTACTTCTAGACTTAAGCG 5' (SEQ ID NO 156)

which fragment comprises two termination codons, and BglII and EcoRI sites from 5' to 3', was added.

Another clone, H48-2, obtained using primers MSHNS1-2 and MSHNS1-3, was sequenced, showing that
said clone has no additional DNA fragment at the 5' side of No. 326 T, while it has the same additional DNA
fragment at the 3' terminus as that of clone H48.

[3] Modification of a DNA Fragment for the Expression of HCV Polypeptide Comprising 92 Amino Acid
Sequence from No. 233 to 324 of SEQ ID NO 31, 32 in E. coli



A DNA fragment encoding a polypeptide comprising the 92 amino acid sequence from amino acid Nos. 233 to 324
of SEQ ID NO 31, 32 appeared to encode an open reading frame (ORF) from the HCV gene. The
modification of the DNA fragment was conducted in the same manner as that used for the modification of the DNA
fragment encoding a polypeptide of the 106 amino acid sequence from amino acid Nos. 109 to 214 of SEQ ID
NO 31, 32 in the above [2] except that the following primers were employed.
5' primer:
MSNS1-4: 5' GCAAGCTTATGATCGGGGGGGTCGGCAACAATAC 3' (SEQ ID NO 160)
5' primer for inserting the DNA fragment into a vector containing an initiation codon:
MSNS1-5: 5' ATCGGGGGGGTCGGCAACAATAC 3' (SEQ ID NO 161)
3' primer:
MSNS1-6: 5' GCGAATTCAGATCTTCATCAAAGCTCTGATCTATCCCTGTCCT 3' (SEQ ID NO 162)
Each synthetic DNA was adjusted to 20 pmole/µl.

The resultant clones are H49 (primers MSNS1-4 and MSNS1-6) and H49-2 (primers MSNS1-5 and
MSNS1-6).

[4] Modification of DNA for the Expression of HCV Polypeptide Encoded by Clones N27MX24A-1,
N27MX24B-1, H48-2 and H49-2 in Insect Cells

Clones N27MX24A-1 and N27MX24B-1 appear to contain an ORF which starts from nucleotide
No. 2 (C). Clones H48-2 and H49-2 contain an ORF which starts from nucleotide No. 1. For the
expression of the polypeptides encoded by these ORFs, an initiation codon ATG is inserted in frame at an
appropriate site upstream from said gene so that the DNA is properly expressed
in insect cells. The insertion of an ATG initiation codon upstream from the 5' terminus of said gene
may be accompanied by an addition of a foreign polypeptide which is not encoded by the HCV gene to the
N-terminus (amino terminus) of the amino acid sequence encoded by clones N27MX24A-1, N27MX24B-1, H48-2
and H49-2. When an expression vector containing an initiation codon for insect cells is used, a DNA
fragment from the clone is ligated to the vector such that the frame of said DNA is in conformity with that of
the initiation codon on said vector. This also can be accompanied by an addition of a foreign polypeptide
which is not encoded by the HCV gene to the N-terminus (amino terminus) of the amino acid sequence encoded
by clones N27MX24A-1, N27MX24B-1, H48-2 and H49-2. The modification of the DNA was carried out
by PCR. Although the modification procedures are described using clone N27MX24A-1, they can be conducted
as well using clone N27MX24B-1. When insect cells were transfected with the DNA and cultivated, clones
N27MX24A-1 and N27MX24B-1 were expressed in a fused form as a precursor, which was then
processed and incompletely glycosylated to give a mature glycoprotein of about 70 kD accumulated
intracellularly. The modification of the DNA of clones N27MX24A-1, N27MX24B-1, H48-2 and H49-2 was carried
out by PCR using the following synthetic DNAs as primers.

5' primers:
MS2724-4: 5' GCGTCGACGCTAGCATGCGGATCCCACAAGCCGTGGTGGAT 3' (SEQ ID NO 163)
MSNS1-7: 5' GCGTCGACGCTAGCATGTTCAACGCGTCCGGATGTCCGGA 3' (SEQ ID NO 164)
MSNS1-8: 5' GCGTCGACGCTAGCATGATCGGGGGGGTCGGCAACAATAC 3' (SEQ ID NO 165)
3' primers:
MS2724-5: 5' GCGAATTCGCTAGCTCACTCTAAGGTGGCGTCGGCGTGGG 3' (SEQ ID NO 166)
MSNS1-9: 5' GCGAATTCGCTAGCTCAACAACCGAACCAGTTGCCCTGCG 3' (SEQ ID NO 167)
MSNS1-10: 5' GCGAATTCGCTAGCTCAAAGCTCTGATCTATCCCTGTCCT 3' (SEQ ID NO 168)
These synthetic DNAs were separately adjusted to 20 pmol/ml before use.
The PCR was carried out using the same reaction solution and worked up in the same manner as
described in the above [1] except that plasmid pUCHN27MX24A-1 (primers MS2724-4 and MS2724-5),
pUCH48 (primers MSNS1-7 and MSNS1-9) or pUCH49 (primers MSNS1-8 and MSNS1-10) was used as the
template plasmid. PCR was accomplished by repeating 10 reaction cycles consisting of: 1 min at
95 °C; 1 min at 50 °C; and 5 min at 72 °C; and then 20 reaction cycles consisting of: 1 min at 95
°C; 1 min at 65 °C; and 5 min at 72 °C. When plasmid pUCHN27MX24A-1, as the template DNA, and primers
MS2724-4 and MS2724-5 were used, a desired 1268 bp DNA fragment was obtained. The combination
of plasmid pUCH48 and primers MSNS1-7 and MSNS1-9 gave a desired 352 bp DNA fragment, and
that of pUCH49 and primers MSNS1-8 and MSNS1-10 gave a desired 322 bp DNA fragment.
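The two-stage cycling program described above can be written down as a small data structure, which makes the number of cycles and the total programmed hold time easy to check. The following Python sketch is illustrative only; the class and variable names are hypothetical, and ramping time between temperatures is not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple  # each step is (temperature in °C, hold time in minutes)

    def hold_minutes(self):
        """Total programmed hold time of this stage, excluding ramping."""
        return self.cycles * sum(minutes for _, minutes in self.steps)


# 10 cycles annealing at 50 °C, then 20 cycles annealing at 65 °C
program = [
    Stage(cycles=10, steps=((95, 1), (50, 1), (72, 5))),
    Stage(cycles=20, steps=((95, 1), (65, 1), (72, 5))),
]

total_cycles = sum(stage.cycles for stage in program)
total_hold_minutes = sum(stage.hold_minutes() for stage in program)

print(f"{total_cycles} cycles, {total_hold_minutes} min of programmed hold time")
```

The lower annealing temperature in the first stage is commonly used when primers carry long non-templated 5' tails, since only part of each primer matches the template in the early cycles; this rationale is a general observation and is not stated in the text.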

Each DNA fragment was digested with NheI, fractionated by acrylamide gel electrophoresis and
extracted by conventional means (Molecular Cloning, Cold Spring Harbor (1982)) to obtain a DNA fragment
of the desired length. The resultant DNA fragment was then ligated into the NheI restriction site of the
transfer vector pBlueBac (Invitrogen), cloned and screened for a clone which contains a single DNA fragment
inserted at the NheI site. Thus, plasmids pBlueN27MX24A-1 derived from the 1268 bp DNA fragment obtained
by primers MS2724-4 and MS2724-5, pBlueH48 derived from the 352 bp DNA fragment obtained by primers
MSNS1-7 and MSNS1-9, and pBlueH49 derived from the 322 bp DNA fragment obtained by primers MSNS1-8
and MSNS1-10 were prepared.

According to the protocol given by Invitrogen, the expression unit of each of these
plasmids contains the DNA fragment derived from the HCV gene oriented forward and ligated to the NheI cloning
site downstream from the polyhedrin promoter.

Example 13 



Expression of HCV Polypeptides Encoded by Clones HN27MX24A-1, HN27MX24B-1, H2N27MX24A-1,
H2N27MX24B-1, H48, H48-2, H49, and H49-2

[1] Expression of Polypeptide Encoded by Clone HN27MX24A-1, HN27MX24B-1, H48, or H49 in E. coli

Each clone encodes a part of the polypeptide encoded by cDNA originated from serum of an HC patient. The
polypeptide encoded by each clone was expressed directly in E. coli, as it is, by subcloning said clone into
the expression vector pCZ44 (Japanese Patent Publication No. 124387/1989).

A clone was digested thoroughly with restriction enzymes HindIII and BglII, extracted with
phenol/chloroform, precipitated with ethanol, and separated by acrylamide gel electrophoresis. From the gel
was extracted the larger DNA fragment having cohesive HindIII- and BglII-restricted ends (Molecular Cloning,
Cold Spring Harbor, 1982). The expression vector pCZ44 was digested with HindIII and BglII. The larger fragment,
containing a region functional for replication in E. coli, was separated, treated in the same manner, ligated
to the HindIII-BglII fragment obtained from a clone such that the vector contains only one insertion, and
cloned conventionally. The resultant plasmids were designated as plasmids pCZ2724A-1, pCZ2724B-1,
pCZ48 and pCZ49 after clones HN27MX24A-1, HN27MX24B-1, H48, and H49, respectively.

Alternatively, an expression vector was constructed using the expression vector pGEX-2T (Pharmacia),
instead of pCZ44, for the expression of a fused protein between a desired polypeptide and GST. The
construction was carried out substantially in accordance with the protocol taught by the manufacturer
(Pharmacia). Thus, the expression vector pGEX-2T was digested with BamHI. The linearized vector was
ligated with a HindIII linker, and ligated with a HindIII-EcoRI DNA fragment prepared from a clone to yield
expression vectors pGEX2724A-1, pGEX2724B-1, pGEX48 and pGEX49.

E. coli JM109 strain transformed with plasmid pCZ2724A-1, pCZ2724B-1, pCZ48 or pCZ49 was grown
in L-Broth at 37 °C overnight (Molecular Cloning, Cold Spring Harbor, 1982). The cultured broth was diluted
50-fold by inoculating it into freshly prepared L-Broth, and cultivation was continued with shaking at 30 °C
for 2 hr. At this time, IPTG was added to the culture to a final concentration of 2 mM and cultivation was
continued for more than 3 hr in order to induce the expression of DNA encoding HCV-originated polypeptide by
single-clone-derived transformants (E. coli JM109 cells transformed solely by one plasmid derived from the
corresponding clone). Base sequences of the cDNA contained in clones HN27MX24A-1 and HN27MX24B-1, and the
amino acid sequences deduced therefrom, are shown in SEQ ID NO 31. The amino acid sequences deduced from the
cDNA contained in clones H48 and H49 are shown by the amino acid sequences from amino acid No.
109 to 214 and from amino acid No. 233 to 324 of SEQ ID NO 31, respectively.
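
The dilution and induction steps above reduce to simple volume arithmetic. The following is a minimal sketch only: the 50 ml culture volume and the 100 mM IPTG stock are hypothetical values chosen for illustration, since the text specifies only the 50-fold dilution and the 2 mM final concentration.

```python
def fold_dilution(total_volume_ml: float, fold: float) -> tuple[float, float]:
    """Split a total volume into (inoculum, fresh broth) for an N-fold dilution."""
    inoculum = total_volume_ml / fold
    return inoculum, total_volume_ml - inoculum


def inducer_volume(culture_ml: float, final_mM: float, stock_mM: float) -> float:
    """Volume of stock that brings the culture to final_mM.

    Solves stock_mM * v = final_mM * (culture_ml + v) for v, so the volume
    added with the stock itself is taken into account.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_ml / (stock_mM - final_mM)


# Hypothetical 50 ml culture and 100 mM IPTG stock (neither is given in the text).
inoculum_ml, broth_ml = fold_dilution(total_volume_ml=50.0, fold=50)
iptg_ml = inducer_volume(culture_ml=50.0, final_mM=2.0, stock_mM=100.0)

print(f"inoculate {inoculum_ml:.2f} ml overnight culture into {broth_ml:.2f} ml L-Broth")
print(f"add {iptg_ml:.3f} ml IPTG stock for 2 mM final")
```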

In the same manner as above, clone HN27MX24B-1 can be used instead of clone HN27MX24A-1
to give a polypeptide encoded by said clone. The deduced amino acid sequence of the polypeptide is
shown in SEQ ID NO 32. 

As mentioned above, plasmids pGEX2724A-1, pGEX2724B-1, pGEX48 and pGEX49 can be used
to obtain transformants capable of expressing a fused protein including a desired polypeptide and GST. Each
plasmid encodes a fused protein GST-CORE comprising GST, which has a thrombin cleavage site at its C-
terminus, and a polypeptide derived from clone HN27MX24A-1, HN27MX24B-1, H48 or H49. A fused
protein comprising an HCV polypeptide at its C-terminal region was produced in E. coli transformants
transformed with any one of plasmids pGEX2724A-1, pGEX2724B-1, pGEX48 and pGEX49 by culturing the cells in
the presence of IPTG in the same manner as that used to produce polypeptide in transformants harboring
plasmid pCZ2724A-1, pCZ2724B-1, pCZ48 or pCZ49.

[2] Expression of Polypeptides Encoded by Clones H2N27MX24A-1, H2N27MX24B-1, H48-2, and H49-2

Polypeptides were expressed in E. coli using cDNA contained in clones H2N27MX24A-1, H2N27MX24B-1,
H48-2, and H49-2 obtained from serum of an HC patient in the same manner as in the above [1]. The cDNA
used was that contained in clone H2N27MX24A-1, H2N27MX24B-1, H48-2, or H49-2, which clone had been
previously isolated and sequenced.

An expression plasmid for each clone was constructed using pOFA (Japanese Patent Publication (KOKAI)
No. 84195/1990). The DNA fragment from each clone was blunt-ended with T4 DNA polymerase. The expression
vector pOFA was digested with KpnI and blunt-ended with T4 DNA polymerase. The DNA fragments thus
obtained were ligated, cloned and screened for a clone having an insertion of one DNA fragment. Thus, the
desired plasmids pOFA2724A-1, pOFA2724B-1, pOFA48 and pOFA49 were prepared by subcloning each clone
so that the + strand capable of expressing HCV protein was inserted in the appropriate orientation for the
correct translation of said strand. It was confirmed by base sequence determination and restriction enzyme
mapping of each plasmid that the cDNA from HCV was properly reconstructed.

Cultivation was carried out in the presence of IPTG in order to induce the expression of DNA encoding
HCV-originated polypeptide by single-clone-derived transformants (E. coli JM109 cells transformed solely
by one plasmid). Base sequences of the cDNAs derived from serum of an HC patient contained in clones
H2N27MX24A-1, H2N27MX24B-1, H48-2 and H49-2, and the amino acid sequences deduced therefrom, are
shown by SEQ ID NO 31, SEQ ID NO 32, the amino acid sequence from No. 109 to 214 of
SEQ ID NO 31, and that from No. 233 to 324 of SEQ ID NO 31, respectively.

[3] Expression of Polypeptide Encoded by Clones N27MX24A-1, N27MX24B-1, H48-2 and H49-2 in Insect
Cells

The expression of HCV-originated glycoprotein encoded by plasmids pBlueN27MX24A-1, pBlueH48 and
pBlueH49 prepared in Example 12 [4] was conducted substantially in accordance with a known expression
manual for baculovirus (MAXBAC™ BACULOVIRUS EXPRESSION SYSTEM MANUAL VERSION 1.4,
hereinafter referred to as Maxbac, Invitrogen).

Plasmids pBlueN27MX24A-1, pBlueH48 and pBlueH49, prepared in Example 12 [4] by inserting a DNA
fragment containing HCV gene at the NheI site of the transfer vector pBlueBac (Maxbac, pp. 37), were
recovered from E. coli host cells transformed thereby, and purified according to the method of Maniatis et
al. (Molecular Cloning, Cold Spring Harbor Laboratory, pp. 86 - 96 (1982)). Thus, a large amount of HCV
gene-containing transfer plasmid DNA was obtained. Sf9 cells were cotransfected with 2 ug of a plasmid
containing a DNA fragment from HCV gene and 1 ug of AcNPV viral DNA (Maxbac, pp. 27). Sf9 cells were
grown in TMN-FH medium (Invitrogen) containing 10% FCS (fetal calf serum) in a 6 cm dish until the cell
density reached about 2 x 10^6/plate. The TMN-FH medium was removed and 0.75 ml of Grace medium
(Gibco) containing 10% FCS was added thereto. To the DNA mixture described above was added
0.75 ml of transfection buffer (attached to the kit), and the mixture was thoroughly mixed by vortexing and
gradually added dropwise onto the Grace medium. After the culture was allowed to stand for 4 hr at 27 °C,
the Grace medium was replaced with 3 ml of TMN-FH medium containing 10% FCS and the dish was incubated
at 27 °C for 6 days. On the third day of incubation, a few multinucleate cells were observed, and on the
sixth day almost all the cells were multinucleate. The supernatant was transferred to a centrifuge tube and
centrifuged at 1,000 rpm for 10 min to obtain the supernatant as a cotransfected viral solution.

The cotransfected viral solution contained about 10^8 viruses/ml, 0.5% of which were recombinant
viruses. The isolation of recombinant virus was carried out by the plaque isolation method described below.
Thus, cells were adsorbed onto a 6 cm dish by seeding 1.5 x 10* cells on medium and removing the
medium completely. To the dish was added 100 ul of a diluted viral solution (10"* and 10^-5 folds),
separately, and the dish was incubated at room temperature for 1 hr while being tilted every 15 min to
spread the virus evenly. X-gal medium containing agarose was prepared by adding 5-bromo-4-chloro-3-indolyl-β-
D-galactoside to a final concentration of 150 ug/l (Maxbac, pp. 16-17) to a warm medium which had been
prepared by autoclaving 2.5% baculovirus agarose (Invitrogen) at 105 °C for 10 min, mixing it with TMN-FH
medium containing 10% FCS preheated to 46 °C at a mixing ratio of 1 : 3, and keeping the temperature
at 46 °C.
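
The titer and recombinant fraction quoted above determine how many plaques one dish can be expected to show. The sketch below works this through for the 10^-5 dilution and the 100 ul inoculum stated in the text; it assumes, for illustration only, that every virus particle counted in the titer forms one plaque.

```python
def expected_plaques(titer_per_ml: float, dilution: float,
                     plated_ml: float, recombinant_fraction: float) -> tuple[float, float]:
    """Return (total plaques, recombinant plaques) expected on one dish.

    Assumes each virus in the titer gives rise to exactly one plaque.
    """
    total = titer_per_ml * dilution * plated_ml
    return total, total * recombinant_fraction


# Figures taken from the text: about 10^8 viruses/ml, 0.5% recombinant,
# 100 ul (0.1 ml) of a 10^-5 dilution plated per dish.
total, recombinant = expected_plaques(
    titer_per_ml=1e8, dilution=1e-5, plated_ml=0.1, recombinant_fraction=0.005
)
print(f"about {total:.0f} plaques per dish, of which about {recombinant:.1f} recombinant")
```

Under these assumptions a single dish carries on the order of one recombinant plaque or fewer, which is consistent with relying on the blue color screen to pick it out.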

After the completion of infection, the virus solution was aspirated thoroughly from the dish and 4 ml of the
warm X-gal medium containing agarose (previously prepared) was gently added to each dish so as not to
detach the cells. The dish was kept open, with the lid slightly slid aside, until the agarose solidified and
dried; thereafter the dish was covered, turned upside down, and incubated at 27 °C for 6 days. The plaques
were observed under a phase-contrast microscope to find blue plaques which did not form multinucleate cells.
Agarose containing blue recombinant plaques was removed with a Pasteur pipet and suspended in 1 ml of
TMN-FH medium by pipetting many times. The above process, which comprises infection, 6-day incubation, and
isolation of virus containing transfer plasmid DNA, is called the "plaque method". The plaque method was
repeated using 100 ul of viral suspension. After repeating said process three times, a recombinant
virus having a gene encoding HCV glycoprotein, free from contamination with the wild-type strain, was obtained.

A viral solution of the primary recombinant virus was prepared by aspirating plaques with a Pasteur
pipet and mixing thoroughly with 1 ml of TMN-FH medium. Because the primary viral solution was too low in
virus density for infection, it required further treatment for concentration. Thus, 100 ul of viral solution was
adsorbed onto Sf9 cells grown in a 6 cm dish to semi-confluence, and 4 ml of TMN-FH medium was added
thereto and incubated for three days. The culture supernatant was recovered to yield a recombinant viral
solution for infection.

For the production of HCV structural protein, a suspension of Sf9 cells in TMN-FH medium containing
10% FCS (5 x 10^6 cells/10 ml medium) was added to a 9 cm dish and kept for 1 hr for adsorption. After the
removal of medium, 250 ul of recombinant viral solution was added to the dish and spread evenly. To
the dish was added 10 ml of TMN-FH medium containing 10% FCS, and the dish was incubated at 27 °C for
4 days. The cells expressing recombinant glycoprotein of HCV were harvested by scraping and suspended in
1,000 ml of phosphate buffered saline.

Thus, HCV-derived glycoprotein was expressed in Sf9 cells transfected with said virus. 

Example 14 



Identification of Expression Products as HCAg

Each expression product obtained in Example 13 was identified as HCAg because it reacted
immunologically with antiserum obtained from HC patients.
Identification was conducted by the Western blot technique.

E. coli cells transformed with any one of expression plasmids pCZ2724A-1, pCZ2724B-1, pCZ48, pCZ49,
pOFA2724A-1, pOFA2724B-1, pOFA48, pOFA49, pGEX2724A-1, pGEX2724B-1, pGEX48, pGEX49,
pBlueN27MX24A-1, pBlueH48 and pBlueH49 for polypeptides encoded by clones HN27MX24A-1,
HN27MX24B-1, H2N27MX24A-1, H2N27MX24B-1, H48, H48-2, H49, and H49-2 were grown in the presence
of IPTG for 3 hr or overnight in the same manner as described in Example 13.

Recombinant strains were harvested by centrifuging 1,000 ul of the cultured broth at 6,500 rpm for 10 min.
The pellet was dissolved in a sample solution (50 mM Tris-HCl, pH 6.8, containing 2% SDS, 5%
mercaptoethanol, 10% glycerin, and 0.005% bromophenol blue) for SDS-polyacrylamide gel electrophoresis
to a final volume of 0.2 ml. Sf9 cells infected with viruses which had been treated more than 3 times by
the plaque method were collected by scraping and suspended in 1,000 ml of PBS, and 100 ul of the
suspension was centrifuged at 6,500 rpm for 10 min to pellet the cells. The pellet was dissolved in a sample
solution for SDS-polyacrylamide gel electrophoresis to a final volume of 0.2 ml.

The sample solutions were then boiled at 100 °C for 10 min. Ten ul of the boiled solution was loaded
onto a 0.1% SDS-15% polyacrylamide gel (70 x 85 x 1 mm) together with a marker protein, LMW Kit E (low-
molecular-weight marker protein, Pharmacia). Electrophoresis was carried out at a constant current of 30
mA for 45 min in Tris buffer (25 mM Tris, pH 8.3, 192 mM glycine, 0.1% SDS) as electrode buffer.
Thereafter, protein was transferred electrophoretically to a nitrocellulose filter by superposing the gel onto a
filter BA-83 (S & S) and applying a constant current of 120 mA for about 20 min between the gel (cathode)
and the filter (anode) in the conventional manner.

The blotted filter was cut to separate the part containing the marker protein (referred to as marker filter)
from that containing the sample (referred to as sample filter). The former was stained with 0.1% (w/v)
amido black 10B and the latter was immersed in 0.01 M PBS (pH 7.4) containing 5% (w/v) bovine serum
albumin (BSA). Serum from an HC patient was diluted 50 times with 0.01 M PBS (pH 7.4) containing 5%
(w/v) BSA. To the sample filter was added 10 ul of diluted serum and the filter was allowed to stand for
2 hr at room temperature. Thereafter, the filter was washed with PBS containing 0.1% (v/v) Tween 20 for
20 min (x3).

The sample filter was then reacted with 10 ml of horseradish peroxidase-conjugated anti-human IgG
(Cappel) at 37 °C for 1 hr and washed with PBS containing 0.1% (v/v) Tween 20 for 20 min (x3). The filter
was then immersed in peroxidase color-developing solution (60 mg 4-chloro-1-naphthol, 20 ml methanol,
80 ml PBS, and 20 ul aqueous hydrogen peroxide). The colored filter was washed with distilled water and
compared with the marker filter, demonstrating that the colored protein expressed by transformants transformed
with plasmid pCZ2724A-1, pCZ2724B-1, pCZ48 or pCZ49 had a reasonable molecular weight as an
expression product of the inserted HCV gene and was identified as HCAg.

The expression product from host cells transformed with pOFA-derived plasmids such as pOFA2724A-1,
pOFA2724B-1, pOFA48 or pOFA49 is a fused protein consisting of HCV-originated polypeptide and the
OmpF signal peptide of E. coli, and the product from host cells transformed with a pGEX-derived plasmid such
as pGEX2724A-1, pGEX2724B-1, pGEX48 or pGEX49 is likewise a fused protein consisting of HCV-originated
polypeptide, GST and a thrombin cleavage site, wherein the latter two are attached at the N-terminus of the
former. Each fused protein has a reasonable molecular weight and was also identified as HCAg.

Insect cells transfected with pBlueN27MX24A-1 and pBlueN27MX24B-1, as shown in Example 13 [3],
expressed HCV polypeptide encoded by clones N27MX24A-1, N27MX24B-1, H48-2, and H49-2. The
expression product (M-gp70) was a glycoprotein of molecular weight of about 70 kD, which corresponds to
the base sequence from about No. 46 to about No. 395 of SEQ ID NO 31 or 32.
Glycoprotein of HCV was also expressed in insect cells transformed with plasmid pBlueH48 or pBlueH49.
The glycoproteins thus produced were encoded by clones H48-2 and H49-2 and had amino acid sequences
which correspond to a polypeptide having 106 amino acids from No. 109 to 214 and that of 96 amino acids
from No. 233 to 324 of SEQ ID NO 31 and 32, respectively. As a result, these glycoproteins were
identified as HCAg.

Example 15 



Synthesis of DNA 

[1] Preparation of RNA Sample Solution 

RNA sample solution was prepared by dissolving the dried nucleic acid obtained in Example 1 in 30 ul
of water containing 10 ul of ribonuclease inhibitor (100 U/ul, Takara Shuzo, Japan).

Oligonucleotide primers of the following base sequences were synthesized using a method well known
to one of skill in the art. Among them, antisense primers such as MS49, MS88, MS100, MS132, MS152, and
MS158 were used for cloning of cDNA.

[2] Synthesis of cDNA Using Antisense Primer

To 2 ul of RNA sample solution were added 1 ul of 15 pmol/ul antisense primer (e.g., a synthetic DNA
primer such as MS158, MS152, MS132, MS49, MS88, or MS100), 2 ul of 10 x RT buffer (100 mM Tris-HCl
(pH 8.3), 500 mM KCl), 4 ul of 25 mM MgCl2, 8 ul of 2.5 mM 4 dNTPs and 1 ul of water, and the mixture
was incubated at 65 °C for 5 min, then at room temperature for 5 min. To the mixture were added 1 ul of
reverse transcriptase (25 U, Life Science) and 1 ul of ribonuclease inhibitor (100 U/ul, Takara Shuzo), and
the mixture was incubated at 37 °C for 20 min, at 42 °C for 30 min, and finally at 95 °C for 2 min, which
was followed by immediate cooling to 0 °C to yield cDNA.
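
The volumes listed for the reverse transcription mixture can be summed to check the final reaction volume and the working concentration of each reagent. The sketch below is only an arithmetic aid; the dictionary keys are illustrative labels, and the volumes and stock concentrations are those stated in the paragraph above.

```python
# Component volumes in ul, as listed for the cDNA synthesis reaction.
RT_MIX_UL = {
    "RNA sample": 2,
    "antisense primer": 1,
    "10x RT buffer": 2,
    "MgCl2": 4,
    "dNTPs": 8,
    "water": 1,
    "reverse transcriptase": 1,
    "ribonuclease inhibitor": 1,
}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Concentration after dilution into the full reaction (same unit as stock)."""
    return stock * volume_ul / total_ul


rt_total_ul = sum(RT_MIX_UL.values())
rt_mgcl2_mM = final_concentration(25.0, RT_MIX_UL["MgCl2"], rt_total_ul)
rt_dntp_mM = final_concentration(2.5, RT_MIX_UL["dNTPs"], rt_total_ul)
rt_buffer_x = final_concentration(10.0, RT_MIX_UL["10x RT buffer"], rt_total_ul)
rt_primer_pmol = 15 * RT_MIX_UL["antisense primer"]  # 15 pmol/ul stock

print(f"total {rt_total_ul} ul; MgCl2 {rt_mgcl2_mM} mM; dNTPs {rt_dntp_mM} mM; "
      f"buffer {rt_buffer_x}x; primer {rt_primer_pmol} pmol")
```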

Amplification of DNA encoding HCAg was conducted by polymerase chain reaction (PCR) (Saiki et al.,
Nature 324: 126 (1986)). For the PCR, the primers synthesized above were used in the following pairs: MS48 -
MS49; MS86 - MS100; MS97 - MS88; MS135 - MS132; MS155 - MS152; or MS151 - MS158.

A 100 ul mixture containing 10 ul of cDNA solution, 10 ul of 10 x PCR buffer (100 mM Tris-HCl
(pH 8.3), 500 mM KCl, 15 mM MgCl2, 1% gelatin), 8 ul of 2.5 mM 4 dNTPs, 2 ul of 15 pmol/ul synthetic
primer (the same primer as used in the preparation of cDNA), 3 ul of 15 pmol/ul synthetic primer (the
counterpart of the primer pair) and water was incubated at 95 °C for 5 min, then immediately cooled to 0
°C. One minute later, it was mixed with 0.5 ul of Taq DNA polymerase (7 U/ul, AmpliTaq™, Takara Shuzo)
and overlaid with mineral oil. The resultant sample was then subjected to PCR. PCR was conducted by
repeating 25 times a reaction cycle comprising the following treatments: 95 °C for 1 min; 40 -
55 °C for 1 min; and 72 °C for 1 - 5 min, in a DNA Thermal Cycler (Perkin Elmer Cetus). Finally, the
reaction mixture was incubated at 72 °C for 7 min, which was followed by phenol/chloroform extraction and
ethanol precipitation. The ethanol precipitation was carried out by adding 2.5 volumes of ethanol and either
about 1/10 volume of 3 M acetic acid or an equal volume of 4 M ammonium acetate to the aqueous
solution, mixing, centrifuging at 15,000 rpm for 15 min using a rotor of about 5 cm in diameter under
cooling at 4 °C to pellet the precipitates, and drying the pellet. Various amplified DNA fragments were
obtained using different primers for the cloning of cDNA and amplification thereof.
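
The amount of water needed to bring the PCR mixture to 100 ul is not stated explicitly but follows from the listed components. The sketch below computes it together with the resulting working concentrations; it assumes that the 0.5 ul of Taq polymerase is added on top of the 100 ul mixture, which is one reading of the paragraph above rather than something the text states.

```python
def water_to_volume(components_ul: dict[str, float], target_ul: float) -> float:
    """Volume of water that brings the listed components up to target_ul."""
    used = sum(components_ul.values())
    if used > target_ul:
        raise ValueError("components already exceed the target volume")
    return target_ul - used


# Volumes in ul as listed for the amplification reaction.
PCR_MIX_UL = {
    "cDNA solution": 10,
    "10x PCR buffer": 10,
    "dNTPs (2.5 mM)": 8,
    "antisense primer (15 pmol/ul)": 2,
    "sense primer (15 pmol/ul)": 3,
}

pcr_water_ul = water_to_volume(PCR_MIX_UL, target_ul=100)
pcr_mgcl2_mM = 15.0 * PCR_MIX_UL["10x PCR buffer"] / 100  # buffer stock holds 15 mM MgCl2
pcr_dntp_mM = 2.5 * PCR_MIX_UL["dNTPs (2.5 mM)"] / 100
pcr_primer_pmol = (15 * 2, 15 * 3)
pcr_taq_units = 0.5 * 7  # 0.5 ul at 7 U/ul

print(f"water {pcr_water_ul} ul; MgCl2 {pcr_mgcl2_mM} mM; dNTPs {pcr_dntp_mM} mM; "
      f"primers {pcr_primer_pmol} pmol; Taq {pcr_taq_units} U")
```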

Example 16 

Cloning and Sequencing of Amplified DNA Fragments

The cloning was carried out substantially in accordance with the method of Molecular Cloning, Cold
Spring Harbor (1982).

Dried DNA fragment (at least 1 pmole) obtained in the above Example 15 [2] was blunt-ended with T4
DNA polymerase (Toyobo), 5'-end phosphorylated with polynucleotide kinase (Toyobo), and ligated into the
SmaI site of the multi-cloning sites of 5 ng to 10 ng of pUC19 cloning vector. The cloning vector had been
previously treated as follows: digestion with restriction enzyme SmaI (Toyobo), phenol/chloroform
extraction, ethanol precipitation, 5'-end dephosphorylation with alkaline phosphatase (Boehringer Mannheim)
(Molecular Cloning (1982) Cold Spring Harbor), phenol/chloroform extraction, and ethanol precipitation. The
ligated DNA was used to transfect competent E. coli JM109 or DH5 cells (Toyobo). The transfection
was carried out according to the manufacturer's protocol (COMPETENT HIGH, Toyobo).
Plasmid clones were recovered from transformed cells conventionally. At least 20 transformants were
obtained using pUC19 cloning vectors containing any of the DNA fragments obtained using any of the pairs of
primers in the same manner as that described in the above Example 15 [2].

The base sequence of each DNA fragment was determined with a fluorescent DNA sequencer
(GENESIS 2000, Dupont) using, as sequencing primers, the following synthetic primers:
5' d(GTAAAACGACGGCCAGT) 3' (SEQ ID NO 143) and
5' d(CAGGAAACAGCTATGAC) 3' (SEQ ID NO 144) for the + and - strands of the DNA fragment to be
sequenced. Base sequences of each clone are shown in SEQ ID NO 37 - 39, 44 - 55, 103 and 104. Clones
belonging to the same region of HCV gene were summarized, and their sequences are shown in SEQ ID
NO 33 to 36. For example, clones MX25-1, MX25-2 and MX25-3 are summarized and shown by SEQ ID NO
33.

In the same manner, clones shown by SEQ ID NO 47 to 55 are summarized as clones shown by SEQ 
ID NO 34 to 36. 

In the Sequence Listings, base sequences shown in SEQ ID NO 33 to 39, 44 and 55 are those of the +
strand of HCV gene inserted into the cloning vector to transfect into host cells. All the clones are double-
stranded, and a plasmid containing one clone and used for the sequencing of said clone is designated by
adding the prefix "pUC" to the name of the clone. Thus, the plasmid containing clone MX25-1 is pUCMX25-1,
that containing MX25-2 is pUCMX25-2, that containing MX25-3 is pUCMX25-3, and so on.

The base sequence shown in the Sequence Listing represents a specific sequence of cDNA
corresponding to RNA isolated from serum of patient(s) suffering from HC and differs from that of cDNA
obtained in the same manner from RNA in serum of a healthy subject. It was confirmed that cDNA prepared
from RNA obtained from a healthy subject under more stringent conditions, for instance, by repeating the
reaction cycles in Example 15 [2] and [3] about 60 - 100 times (= about 3- or 4-fold), did not show any
homology in base sequence with those shown in SEQ ID NO 33 to 43. Consequently, base sequences of clones
shown in SEQ ID NO 33 to 43 are specific for those obtained from serum of HC patients.

The base sequences of DNA fragments were compared with a known base sequence of HCV gene. The 
fact that three clones MX25-1, MX25-2 and MX25-3 were obtained from serum of one HC patient in 
Example 9 [2] using primers MS155 and MS152 strongly suggests that there must be more than one virus 
in a patient. 


Example 17 

Preparation of Fused Clones MX25026A-1, MX25026B-1, N16N15A-1, N16N15B-1, U16N15A-1,
U16N15B-1, N23N15A-1, N23N15B-1, MX25N15A-1, and MX25N15B-1

[1] Preparation of Clones MX25026A-1 and MX25026B-1 

One ul (about 0.5 to 1 ug/ul) each of DNA fragments from clones MX25-1 and 026-1 (prepared in
Example 16) was added to a reaction mixture containing 10 ul of 10 x PCR buffer (100 mM Tris-HCl
(pH 8.3), 500 mM KCl, 15 mM MgCl2, 1% gelatin), 8 ul of 2.5 mM 4 dNTPs, 5 ul each of 20 pmol/ul
synthetic primers MS155 and MS158, and 76.5 ul of water. After thorough mixing, the mixture was heated
at 95 °C for 5 min, then immediately cooled to 0 °C. One minute later, 0.5 ul of Taq DNA polymerase
(7 U/ul, AmpliTaq™, Takara Shuzo) was added to the mixture, which was mixed and overlaid with mineral oil.
The sample was then subjected to PCR. PCR was conducted by repeating 25 times a reaction cycle comprising
the following treatments: 95 °C for 1 min; 37 °C for 1 min; and 72 °C for 2 min, in a DNA Thermal
Cycler (Perkin Elmer Cetus). This was followed by incubation at 97 °C for 2 min. The mixture was
immediately cooled to 0 °C, kept at the same temperature for 2 min, mixed with 0.5 ul of Taq DNA
polymerase (7 U/ul, AmpliTaq™, Takara Shuzo), and overlaid with mineral oil. The sample was then treated
in the same manner as above by repeating 25 times a reaction cycle comprising the following
treatments: 95 °C for 1 min; 50 - 55 °C for 1 min; and 72 °C for 2 min. After the final treatment at 72
°C for 7 min, the resultant reaction solution was treated with phenol/chloroform and then precipitated with
ethanol. The amplified DNA samples were fractionated by agarose gel electrophoresis and the gel portion
containing the desired fragment of the expected length was excised (Molecular Cloning (1982) Cold Spring
Harbor) to isolate the DNA fragment therefrom conventionally. The resultant DNA fragment was then modified
at its termini as described in Example 16 and ligated into the SmaI site of the multi-cloning sites of pUC19,
cloned and screened as described in Example 3 to obtain plasmids pUCMX25026A-1 and pUCMX25026B-1.
cDNAs derived from serum of an HC patient contained in said plasmids were designated as clones
MX25026A-1 and MX25026B-1, respectively, and their base sequences are summarized in SEQ ID NO
40. Base and deduced amino acid sequences of clones MX25026A-1 and MX25026B-1 are shown in
SEQ ID NO 56 and 57, respectively. The overlapping region in clone MX25026A-1 is derived from clone
MX25-1 and that of MX25026B-1 from 026-1.
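
The two-round cycling scheme described above can be written down as data, which makes the programmed hold time of each round easy to total. This is a sketch for orientation only: ramp times between temperatures are ignored, and the `Step` class and function names are illustrative, not part of any instrument software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    minutes: float  # hold time in minutes


def program_minutes(cycle: list[Step], n_cycles: int, extra: list[Step]) -> float:
    """Total programmed hold time: n_cycles repeats of the cycle plus extra holds."""
    return n_cycles * sum(s.minutes for s in cycle) + sum(s.minutes for s in extra)


# Round 1: low-stringency annealing at 37 C, then 97 C for 2 min and 2 min at 0 C.
round1_min = program_minutes(
    cycle=[Step(95, 1), Step(37, 1), Step(72, 2)],
    n_cycles=25,
    extra=[Step(97, 2), Step(0, 2)],
)
# Round 2: annealing at 50 - 55 C (50 C used here), then a final 7 min at 72 C.
round2_min = program_minutes(
    cycle=[Step(95, 1), Step(50, 1), Step(72, 2)],
    n_cycles=25,
    extra=[Step(72, 7)],
)
print(f"round 1: {round1_min} min, round 2: {round2_min} min")
```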

In the same manner as above, clones N16N15A-1 and N16N15B-1, and U16N15A-1 and U16N15B-1,
were prepared using clone N15-1 (SEQ ID NO 39) and either N16 (SEQ ID NO 36) or U16-4 (SEQ
ID NO 37). Base sequences of the clones are summarized in SEQ ID NO 41. Base and amino acid sequences
of clones N16N15A-1 and N16N15B-1 are shown in SEQ ID NO 26 and 27, respectively.

[2] Preparation of Clone N16N15-1 


Two overlapping clones N16-1 and N15-1 were ligated by taking advantage of a unique restriction site
which exists in the overlapping regions of both clones. Upon digestion with restriction enzyme BstEII,
clone N16-1 is cleaved at the 3' site of nucleotide No. 576 and clone N15-1 at the 3' site of nucleotide
No. 114. The ligation of clones N16-1 and N15-1 was conducted on the basis of the assumption that
plasmids pUCN16-1 and pUCN15-1 contain each clone in the same orientation. As a result, a clone
N16N15-1 in which clones N16-1 and N15-1 are ligated without overlapping was constructed. Thus, plasmid
pUCN16-1 was digested with HindIII and BstEII to obtain a 609 bp DNA fragment comprising a HindIII-SmaI
DNA fragment of plasmid pUC19 attached to the 5' end of clone N16-1 (a cDNA clone from serum of an HC
patient). Plasmid pUCN15-1 was digested with HindIII and BstEII to obtain a 147 bp DNA fragment
containing clone N15-1. These 609 bp and 147 bp HindIII-BstEII fragments were then exchanged with each
other, cloned and screened to obtain plasmid pUCN16N15-1 containing the desired clone N16N15-1.
Clones obtainable in the same manner are summarized in SEQ ID NO 41. The base and amino acid
sequences of clone N16N15-1 are shown in SEQ ID NO 60.
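
The fragment sizes quoted above can be cross-checked against the stated cut positions. Each HindIII-BstEII fragment is read here as the vector-derived HindIII-SmaI stretch plus the insert nucleotides up to the cut; that reading, and the function name, are assumptions made for this sketch.

```python
def vector_flank_bp(fragment_bp: int, cut_after_nt: int) -> int:
    """Vector-derived length left after subtracting insert nucleotides 1..cut."""
    return fragment_bp - cut_after_nt


# Sizes and cut positions as given for pUCN16-1 and pUCN15-1.
flank_n16 = vector_flank_bp(fragment_bp=609, cut_after_nt=576)
flank_n15 = vector_flank_bp(fragment_bp=147, cut_after_nt=114)

print(f"pUCN16-1 flank: {flank_n16} bp, pUCN15-1 flank: {flank_n15} bp")
if flank_n16 == flank_n15:
    print("both inserts sit at the same distance from the HindIII site")
```

Equal flank lengths are what one would expect if both clones were inserted at the same SmaI site in the same orientation, which is the premise on which the fragment exchange rests.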

[3] Preparation of Clones N23N15A-1 and N23N15B-1

One ul (about 0.5 to 1 ug/ul) of DNA fragment from each of clone N23-1 (Example 16), and N16N15A-1,
N16N15B-1 and N16N15-1 (Example 17 [1], [2]) was added to a reaction solution containing 10 ul of 10 x
PCR buffer (100 mM Tris-HCl (pH 8.3), 500 mM KCl, 15 mM MgCl2, 1% gelatin), 8 ul of 2.5 mM 4 dNTPs, 5
ul each of 20 pmol/ul synthetic primers MS135 and MS88, and 76.5 ul of water. After thorough mixing,
the mixture was heated at 95 °C for 5 min, then immediately cooled to 0 °C. One minute later, 0.5 ul of
Taq DNA polymerase (7 U/ul, AmpliTaq™, Takara Shuzo) was added to the mixture, which was mixed and
overlaid with mineral oil. The sample was then subjected to PCR. PCR was conducted by repeating 25 times a
reaction cycle comprising the following steps: 95 °C for 1 min; 37 °C for 1 min; and 72 °C for 3.5 -
4 min, in a DNA Thermal Cycler (Perkin Elmer Cetus). After the final incubation at 92 °C for 2 min, the
mixture was immediately cooled to 0 °C, kept at the same temperature for 2 min, mixed with 0.5 ul of Taq DNA
polymerase (7 U/ul, AmpliTaq™, Takara Shuzo), and overlaid with mineral oil. The sample was then treated
in the same manner as above by repeating 25 times a reaction cycle comprising the following
treatments: 95 °C for 1 min; 50 - 55 °C for 1 min; and 72 °C for 3.5 - 4 min. After the final treatment
at 72 °C for 7 min, the resultant reaction solution was treated with phenol/chloroform and then precipitated
with ethanol. The amplified DNA samples were fractionated by agarose gel electrophoresis and the gel portion
containing the desired fragment of the expected length was excised (Molecular Cloning (1982) Cold Spring
Harbor) to isolate the DNA fragment therefrom conventionally. The resultant DNA fragment was then modified
at its termini and ligated into the SmaI site of the multi-cloning sites of pUC19, cloned and screened as
described in Example 16 to obtain plasmids pUCN23N15A-1 and pUCN23N15B-1. cDNAs obtained from
these plasmids are designated as clones N23N15A-1 and N23N15B-1, whose base and deduced amino
acid sequences are shown in SEQ ID NO 61 and 62, respectively, and are summarized in SEQ ID NO 42.
The overlapping region in clone N23N15A-1, a fused clone of N23-1, N16N15A-1, N16N15B-1 and
N16N15-1, originates from clone N23-1, and that of clone N23N15B-1 originates from clones
N16N15A-1, N16N15B-1 and N16N15-1.

[4] Preparation of Clone MX25N15-1 


Two overlapping clones MX25026A-1 and N23N15A-1 were ligated by taking advantage of a unique
restriction site which exists in the overlapping regions of both clones. Upon digestion with restriction
enzyme ApaI, clone MX25026A-1 was cleaved at the 3' site of C at nucleotide No. 1277 and clone
N23N15A-1 at the 3' site of nucleotide No. 17. A plasmid pUCMX25N15-1 in which clones MX25026A-1 and
N23N15A-1 are ligated without overlapping was constructed in the following manner by taking advantage of
the fact that plasmids pUCMX25026A-1 and pUCN23N15A-1 contain clones MX25026A-1 and N23N15A-1 in
the same orientation at the SmaI site of the multi-cloning sites of pUC19. Plasmid pUCMX25026A-1 was digested
with HindIII and ApaI to obtain a 1310 bp DNA fragment comprising a HindIII-SmaI DNA fragment of
plasmid pUC19 attached to the 5' end of clone MX25026A-1 (a cDNA clone from serum of an HC patient).
Plasmid pUCN23N15A-1 was digested with HindIII and ApaI to obtain a 50 bp DNA fragment containing
clone N15-1. These 1310 bp and 50 bp HindIII-ApaI fragments were then exchanged with each other, cloned and
screened to obtain plasmid pUCMX25N15-1 containing the desired clone MX25N15-1. The base and amino
acid sequences of clone MX25N15-1 are shown in SEQ ID NO 63.
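
At the sequence level, joining two overlapping clones at a restriction site that is unique in both amounts to taking the upstream clone up to the cut and the downstream clone from the cut onward. The sketch below illustrates this with short made-up sequences, not with the actual clone sequences; it models only the top strand and uses the ApaI recognition sequence GGGCCC with the cut after the fifth base.

```python
def splice_at_site(upstream: str, downstream: str, site: str, cut_offset: int) -> str:
    """Join two overlapping sequences at a site occurring exactly once in each.

    cut_offset is the number of bases of the site that stay with the
    upstream piece (top strand only; overhangs are not modelled).
    """
    for name, seq in (("upstream", upstream), ("downstream", downstream)):
        if seq.count(site) != 1:
            raise ValueError(f"{name} sequence must contain the site exactly once")
    up_cut = upstream.index(site) + cut_offset
    down_cut = downstream.index(site) + cut_offset
    return upstream[:up_cut] + downstream[down_cut:]


# Toy clones sharing the overlap GACG-GGGCCC-TTAGC (hypothetical sequences).
upstream_clone = "ATGCCATTGACG" + "GGGCCC" + "TTAGC"
downstream_clone = "GACG" + "GGGCCC" + "TTAGC" + "AAGTCCGTAA"

fused = splice_at_site(upstream_clone, downstream_clone, site="GGGCCC", cut_offset=5)
print(fused)
```

The fused sequence contains the overlap, and the restriction site, only once, which corresponds to the "ligated without overlapping" outcome described above.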

The clones MX25026A-1 and N23N15A-1 are ligated by PCR. The resultant base sequences are
summarized in SEQ ID NO 43.

Example 18 



Modification of DNA for the Expression of HCV Polypeptide Encoded by MX25N15-1 


[1] Modification of DNA for the Expression of HCV Polypeptide Encoded by clone MX25N15-1 in E.coli 

Clone MX25N15-1 appeared to contain multiple open reading frames each originating from HCV gene,
such as NS2 ORF (hereinafter referred to as MK/NS2) from No. 7 (T) to 825 (G), and NS3 ORF (MK/NS3)
from No. 826 (G) to 2652 (G) of the base sequence of SEQ ID NO 43. Genes contained therein can be
expressed by inserting an ATG initiation codon in frame and upstream from said gene so that the
expression thereof may be properly effected in host cells. When a partial DNA fragment derived from
MK/NS2 or MK/NS3 is to be expressed, an ATG initiation codon and a termination codon are inserted
upstream and downstream from the DNA to be expressed, respectively, such that the frame of each
inserted codon is in conformity with that of the DNA. The insertion of an ATG initiation codon
upstream from the 5' terminus of a gene may be accompanied by the addition of a foreign polypeptide which
is not encoded by HCV gene to the N-terminus (amino terminus) of an amino acid sequence of SEQ ID NO
43. When an expression vector containing an initiation codon for E. coli is used, a DNA fragment from the
clone is ligated to the vector such that the frame of said DNA is in conformity with that of the codon. The
modification of DNA can be carried out by PCR.
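
The coordinates given for MK/NS2 and MK/NS3 can be checked for reading-frame consistency with a few lines of arithmetic. The sketch below uses only the nucleotide positions quoted above (inclusive, 1-based) and makes no use of the actual sequence.

```python
def orf_length_nt(start: int, end: int) -> int:
    """Length of a region given inclusive 1-based start and end positions."""
    return end - start + 1


NS2 = (7, 825)     # MK/NS2, positions in SEQ ID NO 43
NS3 = (826, 2652)  # MK/NS3, positions in SEQ ID NO 43

ns2_nt = orf_length_nt(*NS2)
ns3_nt = orf_length_nt(*NS3)
ns2_codons, ns2_rem = divmod(ns2_nt, 3)
ns3_codons, ns3_rem = divmod(ns3_nt, 3)
# Two regions share a reading frame when their starts differ by a multiple of 3.
same_frame = (NS3[0] - NS2[0]) % 3 == 0

print(f"MK/NS2: {ns2_nt} nt = {ns2_codons} codons (remainder {ns2_rem})")
print(f"MK/NS3: {ns3_nt} nt = {ns3_codons} codons (remainder {ns3_rem})")
print(f"same reading frame: {same_frame}")
```

Both regions are whole numbers of codons in one frame, so an ATG placed directly before position 7 or position 826 is in frame with the respective region.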

For the expression of MK/NS2, the following synthetic oligonucleotide primers were used. 
5' primer: 

MSNS2-1: 5' GCAAGCTTATGTGGTTGTGGATGATGCTGCTG 3' (SEQ ID NO 169) 

5' primer for the insertion of said DNA fragment into a vector having an initiation codon from a procaryotic
expression vector:

MSNS2-2: 5' TGGTTGTGGATGATGCTGCTG 3' (SEQ ID NO 170) 
3' primer: 

MSNS2-3: 5' GCGAATTCAGATCTTCATCACCTCCGGGCGGAGACNGGNAGNCC 3' (SEQ ID NO 171) 
The synthetic DNA was adjusted to 20 pmol/ml before use. 

PCR was carried out in the same manner as described above according to Saiki's method in a
total volume of 100 ul containing 100 ng of plasmid pUCMX25N15-1 (or plasmid pUCMX25026A-1) as a
template, and 2 ul each of the 5' and 3' primers MSNS2-1 and MSNS2-3. The reaction mixture was heated at
95 °C for 5 min and quenched at 0 °C. One minute later, 0.5 ul of Taq DNA polymerase
(7 U/ul, AmpliTaq™, Takara Shuzo) was added to the mixture, which was mixed thoroughly and overlaid with
mineral oil. The sample was reacted by repeating 25 cycles of treatment comprising: 95 °C for 1 min; 60 °C
for 1 min; and 72 °C for 3 min, in a DNA Thermal Cycler (Perkin Elmer Cetus). The resultant reaction
solution was extracted with phenol/chloroform and precipitated with ethanol conventionally. The amplified
DNA samples were digested with HindIII and EcoRI, fractionated by acrylamide gel electrophoresis and
extracted (Molecular Cloning, Cold Spring Harbor (1982)).

The DNA fragment is then ligated into the HindIII (in the case of the MSNS2-2 primer, SmaI) and EcoRI sites
of cloning vector pUC19 to obtain plasmid pUCHNS2-1. The base sequence of said plasmid is determined to
show that it comprises a DNA fragment shown by the base sequence from No. 7 to 825 in SEQ ID NO 43
having additional DNA fragments attached to both the 5'- and 3'-termini. That is, at its 5'-terminus, the
following DNA fragment comprising a HindIII restriction site followed by an initiation codon ATG was
attached.

5' GCAAGCTTATG 3'
3' CGTTCGAATAC 5' (SEQ ID NO 155)

And at its 3'-terminus, the following DNA fragment comprising two termination codons and BglII and EcoRI
sites, from 5' to 3', was attached.

5' TGATGAAGATCTGAATTCGC 3'
3' ACTACTTCTAGACTTAAGCG 5' (SEQ ID NO 156)
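
The relationship between the 3' primer MSNS2-3 and the fragment attached at the 3'-terminus can be verified mechanically: the 5' end of the primer should be the reverse complement of the attached sense-strand fragment. The sketch below checks this and locates the features named in the text; the variable and function names are illustrative.

```python
# Complement table for DNA, keeping the degenerate base N unchanged.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Sense strand of the fragment attached at the 3'-terminus (SEQ ID NO 156).
TAIL_SENSE = "TGATGAAGATCTGAATTCGC"
# 3' primer MSNS2-3 (SEQ ID NO 171), written 5' to 3'.
MSNS2_3 = "GCGAATTCAGATCTTCATCACCTCCGGGCGGAGACNGGNAGNCC"

primer_tail = reverse_complement(TAIL_SENSE)
tail_matches_primer = MSNS2_3.startswith(primer_tail)

stop_codons = [TAIL_SENSE[i:i + 3] for i in (0, 3)]
bglii_pos = TAIL_SENSE.find("AGATCT")   # BglII recognition sequence
ecori_pos = TAIL_SENSE.find("GAATTC")   # EcoRI recognition sequence

print(primer_tail, tail_matches_primer)
print(stop_codons, bglii_pos, ecori_pos)
```

Read 5' to 3' on the sense strand, the fragment thus carries two TGA codons, then the BglII site, then the EcoRI site, in the order the text describes.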

For an expression vector containing an E. coli-derived initiation codon, DNA was modified in the same
manner as above except that primers MSNS2-2 and MSNS2-3 were employed and the amplified DNA was
first blunt-ended with T4 DNA polymerase and then digested with EcoRI, instead of being digested with
HindIII and EcoRI, to obtain plasmid pUCH2NS2-1. The sequencing of the resultant clone H2NS2-1 showed
that said clone has no additional DNA fragment at the 5' terminus but, at the 3' terminus, has the same DNA
fragment as that of the above clone HNS2-1.

Modification of DNA for the expression of MK/NS3 was conducted substantially in the same manner as
above except that primers MSNS3-1, 3-2 and 3-3 were used instead of MSNS2-1, 2-2, and 2-3,
respectively, to obtain plasmids pUCHNS3-1 and pUCH2NS3-1, corresponding to the above plasmids
pUCHNS2-1 and pUCH2NS2-1.

MSNS3-1 : 5* GCAAGCTTATGGGCAACGAGNTNCTNCTIGG 3' (SEQ ID NO 172) 
35 MSNS3-2: 5* GGCAACGAGNTNCTNCTNGG 3* (SEQ ID NO 173) 

MSNS3-3: 5' GCGAATTCAGATCTTCATCACTTCAGCCGTATGAGACACTT 3' (SEQ ID NO 174) 

[2] Modification of DNA for the Expression of DNA encoding HCV Polypeptide MK1 in E. coli 



DNA encoding MK1 polypeptide, shown by the 305 amino acid sequence from No. 422 to 726 in SEQ ID 
NO 43, was modified in the same manner as in [1] above by inserting an initiation codon ATG in frame 
upstream of the 5' terminus of said DNA in the ORF encoding MK1. The insertion of an ATG initiation 
codon upstream of the 5' terminus of said gene may be accompanied by an addition of a foreign 
polypeptide which is not encoded by the HCV gene. When an expression vector containing an initiation 
codon for E. coli is used, a DNA fragment from the clone is ligated to the vector such that the frame of 
said DNA is in conformity with that of the codon. The modification of DNA can be carried out by PCR 
using the following synthetic oligonucleotide primers. 

5' primer: 
MSMK1-1: 5' GCAAGCTTATG CTGTCGCCCGGGCCCATCTC 3' (SEQ ID NO 175) 

5' primer for the insertion of a DNA fragment into a vector having an initiation codon from a procaryotic 
expression vector: 
MSMK1-2: 5' CTGTCGCCCGGGCCCATCTC 3' (SEQ ID NO 176) 

3' primer: 
MSMK1-3: 5' GCGAATTCAGATCTTCATCAACATGTGTTGCAGTCGATCAC 3' (SEQ ID NO 177) 

The synthetic DNA was adjusted to 20 pmol/ml before use. 

PCR, cloning and subcloning were carried out in the same manner as described in [1] above. 

Plasmid pUCMK1 was prepared by cloning a DNA fragment obtained by PCR using MSMK1-1 and 
MSMK1-3. The base sequence of clone MK1 contained in plasmid pUCMK1 was determined to show that it 
comprises a DNA fragment having the base sequence from No. 1264 (G) to 2178 (G) in SEQ ID NO 43, 
with additional DNA fragments attached to both the 5'- and 3'-termini. That is, at its 5'-terminal G, a 
DNA fragment which comprises a HindIII restriction site followed by an initiation codon ATG is attached 
as follows: 

5' GCAAGCTTATG 3' 
3' CGTTCGAATAC 5' (SEQ ID NO 155). 

And at its 3'-terminal G, a DNA fragment which comprises two termination codons and BglII and EcoRI 
sites from 5' to 3' is attached as follows: 

5' TGATGAAGATCTGAATTCGC 3' 
3' ACTACTTCTAGACTTAAGCG 5' (SEQ ID NO 156) 

Another plasmid, pUCMK1-2, was constructed in the same manner as above except that primers 
MSMK1-2 and MSMK1-3 were employed. Sequencing of the resultant clone MK1-2 showed that said clone 
has no additional DNA fragment at the 5' terminus but has the same additional DNA fragment as that of 
the above clone MK1 at its 3' terminus. 
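
The amino acid range given for MK1 (No. 422 to 726) and the base range found in clone MK1 (No. 1264 to 2178) can be cross-checked with simple codon arithmetic. The sketch below assumes that amino acid No. 1 of SEQ ID NO 43 is encoded by bases No. 1 to 3, which is an assumption made for illustration; `codon_span` is a hypothetical helper name.

```python
# Minimal sketch: convert a 1-based amino acid range into the 1-based base
# range that encodes it, assuming residue No. 1 is encoded by bases No. 1-3.
def codon_span(first_aa, last_aa):
    """Return (first_base, last_base, residue_count) for an amino acid range."""
    first_base = 3 * (first_aa - 1) + 1  # first base of the first codon
    last_base = 3 * last_aa              # third base of the last codon
    residue_count = last_aa - first_aa + 1
    return first_base, last_base, residue_count


mk1 = codon_span(422, 726)
print(mk1)  # bases and residue count for MK1
```

Under that assumption the MK1 range maps to bases 1264 to 2178 and spans 305 residues, in agreement with the figures stated in the text.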

[3] Modification of DNA for the Expression of DNA Encoding HCV Polypeptide MK2 

MK2 polypeptide, shown by the 322 amino acid sequence from No. 712 to 1033 in SEQ ID NO 43, appears 
to be an HCV-derived antigenic protein which is highly reactive with antiserum from an HC patient. For 
the expression of DNA encoding MK2 in E. coli, said DNA was modified in the same manner as in [2] above 
using the following synthetic oligonucleotide primers. 

5' primer: 
MSMK2-1: 5' GCAAGCTTATG GGCTATACCGGNGACTTNGAC 3' (SEQ ID NO 178) 

5' primer for the insertion of a DNA fragment into a vector having an initiation codon from a procaryotic 
expression vector: 
MSMK2-2: 5' GGCTATACCGGNGACTTNGAC 3' (SEQ ID NO 179) 

3' primer: 
MSMK2-3: 5' GCGAATTCAGATCTTCAGTGCTTCGCCCAGAAGGT 3' (SEQ ID NO 180) 

The synthetic DNA was adjusted to 20 pmol/ml before use. The resultant clones were designated as 
clone MK2 (prepared using primers MSMK2-1 and MSMK2-3) and clone MK2-2 (prepared using primers 
MSMK2-2 and MSMK2-3). 

[4] Modification of DNA for the Expression of HCV Polypeptide Encoded by Clone MX25N15-1 in Insect 
Cells 

Clone MX25N15 contains an open reading frame which starts from nucleotide No. 1 (T). For the 
construction of expression plasmids for insect cells, DNA was modified essentially in the same manner as 
described above by inserting an initiation codon ATG upstream of the 5' terminus of said DNA in the 
ORF. The insertion of an ATG initiation codon upstream of the 5' terminus of said gene may be 
accompanied by an addition of a foreign polypeptide which is not encoded by the HCV gene to the 
N-terminus (amino-terminus) of the total or a part of the amino acid sequence which is encoded by clone 
MX25N15. When an expression vector containing an initiation codon for insect cells is used, a DNA 
fragment from the clone is ligated to the vector such that the expression of said DNA can be initiated at 
the codon. In this case, the modification may likewise be accompanied by an addition of a foreign 
polypeptide which is not encoded by the HCV gene to the N-terminus (amino-terminus) of the total or a 
part of the amino acid sequence which is encoded by clone MX25N15. When an insect cell was 
transformed with an expression plasmid containing MX25N15, said clone was expressed as a precursor 
polypeptide having an amino acid sequence which at least contains amino acids from No. 167 to 502, 
which was then processed, glycosylated and accumulated intracellularly. 
The modification of clone MX25N15 DNA was carried out by PCR employing the following synthetic 
oligonucleotides as primers. 

5' primer: 
MS2515-1: 5' GCGCTAGCATGTGGTTGTGGATGATGCTG 3' (SEQ ID NO 181) 

3' primer: 
MS2515-2: 5' GCGAATTCGCTAGCTCACAGCCGGTTCATCCACTGCAC 3' (SEQ ID NO 182) 

The synthetic DNA was adjusted to 20 pmol/ml before use. 

The PCR was carried out using the same reaction solution and worked up in the same manner as 
described above except that plasmid pUCMX25N15-1 was used as the template plasmid, MS2515-1 and 
MS2515-2 were used as primers, and the PCR was carried out by repeating 10 times a reaction cycle 
consisting of 1 min at 95 °C, 1 min at 50 °C and 5 min at 72 °C, and then repeating 20 times a reaction 
cycle consisting of 1 min at 95 °C, 1 min at 65 °C and 5 min at 72 °C, to obtain the desired 3586 bp DNA 
fragment. 
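
The two-stage cycling schedule described above can be written down as data, which makes the cycle count and the programmed hold time easy to verify. This is a sketch only: the class and function names are hypothetical, and ramp times between temperatures are not stated in the text and are therefore ignored.

```python
# Sketch of the two-stage PCR schedule as plain data; names are illustrative.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # hold temperature in degrees Celsius
    minutes: int  # hold time in minutes


@dataclass(frozen=True)
class Stage:
    cycles: int   # number of times the steps are repeated
    steps: tuple  # ordered Step objects making up one cycle


program = (
    Stage(10, (Step(95, 1), Step(50, 1), Step(72, 5))),  # lower annealing temperature
    Stage(20, (Step(95, 1), Step(65, 1), Step(72, 5))),  # higher annealing temperature
)


def total_cycles(stages):
    """Total number of cycles over all stages."""
    return sum(stage.cycles for stage in stages)


def hold_minutes(stages):
    """Total programmed hold time, excluding temperature ramps."""
    return sum(
        stage.cycles * sum(step.minutes for step in stage.steps)
        for stage in stages
    )


print(total_cycles(program), "cycles,", hold_minutes(program), "min of hold time")
```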

The amplified DNA samples were digested with NheI, fractionated by acrylamide gel electrophoresis 
and extracted conventionally (Molecular Cloning, Cold Spring Harbor (1982)). The DNA fragment was 
then ligated into the NheI restriction site of a transfer vector pBlueBac (Invitrogen), cloned and screened 
conventionally for a clone containing a single DNA insert at the NheI site to obtain plasmid 
pBlueMX25N15-1. 

According to the instructions provided by the manufacturer (Invitrogen), the expression unit of the 
resultant plasmid contains a DNA fragment derived from the HCV gene oriented forward and ligated to 
the NheI cloning site downstream from a polyhedrin promoter. 

Example 19 



Expression of HCV Polypeptides Encoded by MX25N15, and Polypeptides MK/NS2, MK/NS3, MK1 and 
MK3 

[1] Expression of Polypeptide Encoded by Clones HNS2-1, HNS3-1, MK1 or MK2 

Clones HNS2-1, HNS3-1, MK1 and MK2 encode polypeptide fragments derived from a polypeptide 
encoded by cDNA obtained from the serum of an HC patient, and were expressed as they are in E. coli. 

Construction of an expression vector for each clone was carried out by subcloning it into an expression 
vector pCZ44 (Japanese Patent Publication No. 1-124387/1989). 

Each clone was digested thoroughly with restriction enzymes HindIII and BglII, extracted with 
phenol/chloroform, precipitated with ethanol and separated by acrylamide gel electrophoresis, and a 
DNA fragment having cohesive HindIII- and BglII-restricted ends was extracted from the gel (Molecular 
Cloning, Cold Spring Harbor, 1982). The expression vector pCZ44 was digested with HindIII and BglII, 
and the larger fragment containing the functional region for expressing DNA was separated and treated 
in the same manner. Both DNA fragments were ligated at their HindIII and BglII sites and cloned. The 
resultant plasmids were named pCZHNS2-1, pCZHNS3-1, pCZMK1 and pCZMK2 after clones HNS2-1, 
HNS3-1, MK1 and MK2, respectively. 

Alternatively, expression vectors encoding polypeptides encoded by clones HNS2-1, HNS3-1, MK1 and 
MK2 were constructed using an expression vector pGEX-2T (Pharmacia) designed to express a fused 
protein of the desired peptide and glutathione-S-transferase (GST). The construction was carried out 
substantially in accordance with the protocol taught by the manufacturer (Pharmacia). 

The expression vector pGEX-2T was digested with BamHI. To the linearized vector was ligated a HindIII 
linker to obtain a DNA fragment having EcoRI and HindIII restriction sites at its 3'- and 5'-termini. Each 
clone was digested with HindIII and EcoRI to obtain DNA fragments encoding the desired HCV 
polypeptides. The two fragments were then ligated at their HindIII and EcoRI sites to obtain expression 
vectors pGEXHNS2-1, pGEXHNS3-1, pGEXMK1, and pGEXMK2, respectively. These plasmids contain DNAs 
encoding GST, a thrombin-cleaving sequence, and the desired clone, from upstream to downstream. 

E. coli JM109 strain was transformed with plasmid pCZHNS2-1, pCZHNS3-1, pCZMK1 or pCZMK2 
and the transformant was grown in L-Broth at 37 °C conventionally (Molecular Cloning, Cold Spring 
Harbor, 1982). The culture was inoculated into fresh L-Broth at a 1/50 dilution and cultured with shaking 
at 30 °C for 2 hr. IPTG (isopropyl-β-D-thiogalactopyranoside) was added to the culture to a final 
concentration of 2 mM in order to induce exclusively the expression of DNA by the single-clone-derived 
transformant (E. coli transformed with a single plasmid), and cultivation was continued for more than 
3 hr. Thus, the transformant produced a polypeptide encoded by the clone. Deduced amino acid 
sequences of polypeptides encoded by cDNA derived from clones HNS2-1, HNS3-1, MK1 and MK2 are 
shown by amino acid sequences from 422 to 726 and 712 to 1033 in SEQ ID NO 43. 

E. coli JM109 cells were transformed with expression vector pGEXHNS2-1, pGEXHNS3-1, pGEXMK1, 
or pGEXMK2 and cultured in the same manner as above. The expression of the gene encoding a fused 
polypeptide was induced by IPTG. The resultant fused protein comprises GST, a thrombin-cleaving site 
at its C-terminus, and a polypeptide derived from clone HNS2-1, HNS3-1, MK1 or MK2. 

[2] Expression of Polypeptides Encoded by Clones H2NS2-1, H2NS3-1, MK1-2, and MK2-2 

Clones H2NS2-1, H2NS3-1, MK1-2, and MK2-2, which had been isolated and sequenced, were 
expressed in E. coli to give a fused protein in substantially the same manner as above. 

A fused protein comprising, for instance, a signal peptide of OmpF, an outer membrane protein of 
E. coli, and a polypeptide encoded by any of the above-mentioned clones can be expressed using pOFA 
(Japanese Patent Publication No. 84195/1990) as the expression vector. The DNA fragment from each 
clone H2NS2-1, H2NS3-1, MK1-2, or MK2-2 was blunt-ended with T4 DNA polymerase. The expression 
vector pOFA was digested with KpnI and blunt-ended with T4 DNA polymerase. The DNA fragments thus 
obtained were ligated, cloned and screened for a clone having an insertion of one DNA fragment. Thus, 
the desired plasmids pOFANS2-1, pOFANS3-1, pOFAMK1, and pOFAMK2 were prepared by subcloning a 
clone so that the + strand responsible for the expression of HCV protein was inserted appropriately for 
the correct translation of the clone. It was confirmed by the determination of the base sequence and 
mapping of each plasmid that the HCV-derived cDNA was reconstructed properly. 

E. coli JM109 cells were transformed with an expression vector obtained above, and the expression of 
DNA was induced by growing the host cells in the presence of IPTG as previously described. 
Transformants expressed a fused protein of a signal peptide of OmpF and an HCV polypeptide encoded 
by each clone. DNA sequences and deduced amino acid sequences of polypeptides encoded by clones 
HNS2-1, HNS3-1, MK1 and MK2 are shown by amino acid sequences from 422 to 726 and 712 to 1033 in 
SEQ ID NO 43. Thus, according to this method, HCV polypeptide was expressed as a fused protein 
between the OmpF signal peptide and the polypeptide encoded by each clone. 

[3] Expression of Polypeptide Encoded by Clone MX25N15 in Insect Cells 

The expression of HCV polypeptide encoded by plasmid pBlueMX25N15 prepared in Example 18, [4] 
was conducted substantially in accordance with a known expression manual for baculovirus (MAXBAC™ 
BACULOVIRUS EXPRESSION SYSTEM MANUAL VERSION 1.4, Invitrogen; hereinafter referred to as 
Maxbac). 

Plasmid pBlueMX25N15, a plasmid prepared by inserting a DNA fragment containing the HCV gene at 
the NheI site of a transfer vector pBlueBac (Maxbac, pp.37), was recovered from E. coli/pBlueMX25N15 
and purified according to the method of Maniatis et al. (Molecular Cloning, Cold Spring Harbor 
Laboratory, pp.86 - 96 (1982)) to obtain a large amount of HCV gene-containing transfer plasmid DNA. 
Sf9 cells were cotransfected with 2 ug of plasmid pBlueMX25N15 and 1 ug of AcNPV viral DNA (Maxbac, 
pp.27). Specifically, Sf9 cells were grown in TMN-FH medium (Invitrogen) containing 10% FCS (fetal calf 
serum) in a 6 cm dish until the cell density reached about 2 x 10^5/plate. The TMN-FH medium was 
removed and 0.75 ml of Grace medium (Gibco) containing 10% FCS was added thereto. A DNA solution of 
2 ug of plasmid pBlueMX25N15 DNA and 1 ug of AcNPV viral DNA in 0.75 ml of transfection buffer 
(attached to the kit) was thoroughly mixed by vortex and gradually added dropwise onto the Grace 
medium. After standing for 4 hr at 27 °C, the Grace medium was replaced with 3 ml of TMN-FH medium 
containing 10% FCS and the dish was incubated at 27 °C for 6 days. On the third day of incubation, a few 
multinucleate cells were observed, and on the sixth day almost all the cells were multinucleate. The 
supernatant was transferred to a centrifuge tube and centrifuged at 1,000 rpm for 10 min to obtain the 
supernatant as a cotransfected viral solution. 

The cotransfected viral solution contained about 10^8 virus/ml, 0.5% of which were recombinant virus. 
The isolation of recombinant virus was carried out by the plaque isolation method described below. 

Thus, cells were adsorbed onto a 6 cm dish by seeding 1.5 x 10^6 cells in medium and removing the 
medium. To separate dishes was added 100 ul of a diluted viral solution (10^-4 and 10^-5 dilutions), and 
the dishes were incubated at room temperature for 1 hr while being tilted every 15 min to spread the 
virus extensively. X-gal medium containing agarose was prepared by adding 
5-bromo-4-chloro-3-indolyl-β-D-galactoside to a final concentration of 150 ug/l to a warm medium which 
had been prepared by autoclaving 2.5% baculovirus agarose (Invitrogen) at 105 °C for 10 min, mixing it 
with TMN-FH medium containing 10% FCS preheated at 46 °C at a mixing ratio of 1 : 3, and keeping the 
temperature at 46 °C. 
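
The choice of the 10^-4 and 10^-5 dilutions can be rationalised with a short calculation from the figures quoted above (about 10^8 virus/ml, 0.5% recombinant, 100 ul plated per dish). The sketch below is an illustration under the assumption that each virus particle forms one plaque; the function name is hypothetical, and exact fractions are used to avoid rounding noise.

```python
# Sketch: expected plaques per dish, assuming one plaque per virus particle.
from fractions import Fraction

TITER_PER_ML = 10**8                       # approximate titer of the cotransfected solution
RECOMBINANT_FRACTION = Fraction(5, 1000)   # 0.5% recombinant virus
INOCULUM_ML = Fraction(1, 10)              # 100 ul plated per dish


def expected_plaques(dilution_exponent):
    """Return (total, recombinant) expected plaques for a 10^-n dilution."""
    total = TITER_PER_ML * INOCULUM_ML / 10**dilution_exponent
    return total, total * RECOMBINANT_FRACTION


for exponent in (4, 5):
    total, recombinant = expected_plaques(exponent)
    print(f"10^-{exponent}: about {float(total):g} plaques, "
          f"about {float(recombinant):g} recombinant")
```

On these assumptions a dish receives on the order of a thousand plaques at 10^-4 and a hundred at 10^-5, with only a handful of recombinant (blue) plaques among them, which is consistent with the need for repeated rounds of plaque purification.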

After the completion of infection, the virus solution was aspirated thoroughly from the dish and 4 ml of 
the warm X-gal medium containing agarose (previously prepared) was gently added to each dish so as 
not to detach the cells. The dish was kept open by slightly sliding the lid until the agarose solidified and 
dried; thereafter the dish was covered, turned upside down, and incubated at 27 °C for 6 days. The 
plaques were observed under a phase-contrast microscope to find blue plaques which do not form 
multinucleate cells. Agarose containing blue recombinant plaques was removed with an aspirating pipet 
and suspended in 1 ml of TMN-FH medium by repeated pipetting. The above process, which comprises 
infection, 6-day incubation, and isolation of virus containing transfer plasmid DNA, is called the "plaque 
method". The plaque method was repeated using 100 ul of viral suspension. After repeating said process 
three times, a recombinant virus having a gene encoding HCV glycoprotein, free from contamination 
with the wild-type strain, was obtained. 

A viral solution of the primary recombinant virus was prepared by aspirating plaques with a Pasteur 
pipet and mixing thoroughly with 1 ml of TMN-FH medium. Because the primary viral solution was low in 
virus density for infection, it was further treated. Thus, 100 ul of viral solution was adsorbed onto Sf9 
cells grown to semi-confluence in a Petri dish (6 cm in diameter), and 4 ml of TMN-FH medium was added 
thereto and incubated for three days. The culture supernatant was recovered to yield a recombinant viral 
solution for infection. 

2. Infection of Sf9 Cells with Recombinant Viral Solution 

A suspension of Sf9 cells in TMN-FH medium containing 10% FCS (5x10* cells/10 ml medium) was 
added to a Petri dish (9 cm in diameter) and kept for 1 hr for adsorption. After the removal of medium, 
250 ul of recombinant viral solution was added to the dish and spread extensively. To the dish was added 
10 ml of TMN-FH medium containing 10% FCS, and the dish was incubated at 27 °C for 4 days. The cells 
expressing recombinant glycoprotein of HCV were harvested by scraping and suspended in 1,000 ml of 
phosphate buffered saline. Thus, HCV glycoproteins were expressed by Sf9 cells infected with said viral 
solution. 

Example 20 



Identification of Expression Product as HCAg 

Each expression product obtained in Example 19 was identified as HCAg because it reacted 
immunologically with antiserum obtained from an HC patient. Identification of the expression product as 
HCAg was conducted by Western blot as follows. E. coli cells were transformed with any of the plasmids 
described in Example 19 [1], [2], namely plasmids pCZHNS2-1, pCZHNS3-1, pCZMK1, pCZMK2, 
pGEXHNS2-1, pGEXHNS3-1, pGEXMK1, pGEXMK2, pOFANS2-1, pOFANS3-1, pOFAMK1, and pOFAMK2 for 
polypeptides encoded by clones HNS2-1, HNS3-1, MK1, MK2, HNS2-1, HNS3-1, MK1, MK2, H2NS2-1, 
H2NS3-1, MK1-2, and MK2-2, respectively, and grown in the presence of IPTG for 3 hr or overnight. 

Recombinant strains were harvested by centrifuging 1,000 ul of the cultured broth at 6,500 rpm for 
10 min. The pellet was dissolved in a sample solution (50 mM Tris-HCl, pH 6.8, containing 2% SDS, 5% 
mercaptoethanol, 10% glycerin, and 0.005% bromophenol blue) for SDS-polyacrylamide gel 
electrophoresis to a final volume of 0.2 ml. The sample solution was then boiled at 100 °C for 10 min. Ten 
ul of the boiled solution was loaded on a 0.1% SDS-15% polyacrylamide gel (70 x 85 x 1 mm) together 
with a marker protein, LMW Kit E (low-molecular weight marker protein, Pharmacia). Electrophoresis 
was carried out at a constant current of 30 mA for 45 min in Tris buffer (25 mM Tris, pH 8.3, 192 mM 
glycine, 0.1% SDS) as electrode buffer. Thereafter, protein was transferred electrophoretically to a 
nitrocellulose filter by superposing the gel onto a filter BA-83 (S & S) and applying a constant current of 
120 mA for about 20 min between the gel (cathode) and the filter (anode) conventionally. 

The blotted filter was cut to separate a region containing the marker protein (marker filter) from that 
containing the sample (sample filter); the former was stained with 0.1% (w/v) amido black 10B and the 
latter was immersed in 0.01 M PBS (pH 7.4) containing 5% (w/v) bovine serum albumin (BSA). A serum 
from a patient suffering from hepatitis C was diluted 50 times with 0.01 M PBS (pH 7.4) containing 5% 
(w/v) BSA. To the sample filter was added 10 ul of diluted serum and the filter was allowed to stand for 
2 hr at room temperature. Thereafter, the filter was washed with PBS containing 0.1% (v/v) Tween 20 for 
20 min (x3). The sample filter was then reacted with 10 ml of horseradish peroxidase conjugated 
anti-human IgG (Cappel) at 37 °C for 1 hr and washed with PBS containing 0.1% (v/v) Tween 20 for 
20 min (x3). The filter was then immersed in a peroxidase color-producing solution (60 mg 
4-chloro-1-naphthol, 20 ml methanol, 80 ml PBS, and 20 ul aqueous hydrogen peroxide). The colored 
filter was washed with distilled water and compared with the marker filter to demonstrate that the 
product which developed color had a reasonable molecular weight as an expression product of the HCV 
gene contained in expression plasmid pCZHNS2-1, pCZHNS3-1, pCZMK1 or pCZMK2, and it was identified 
as an HC-associated antigen. 

The expression product from host cells transformed with plasmid pOFANS2-1, pOFANS3-1, pOFAMK1 
or pOFAMK2 is a fused protein consisting of an HCV-derived polypeptide and the OmpF signal peptide of 
E. coli, wherein the latter is attached at the N-terminus of the former. The expression product from host 
cells transformed with plasmid pGEXHNS2-1, pGEXHNS3-1, pGEXMK1 or pGEXMK2 is also a fused protein, 
consisting of an HCV-derived polypeptide and GST with a thrombin-cleaving site, wherein the latter is 
attached at the N-terminus of the former. These fused proteins have reasonable molecular weights and 
were also identified as HC-associated antigens. 

Example 21 

Synthesis of cDNA 



In the present Example 21, the water used is ultra-pure water which was prepared by autoclaving (x 2) 
distilled water. 

[1] Preparation of RNA Sample Solution 

RNA isolated in Example 1 was dried, dissolved in 0.3 M (pH 7.0) sodium acetate, treated with 
phenol/chloroform (x 1), mixed with 2.5 volumes of ethanol, and centrifuged (15,000 rpm, 20 min, at 
room temperature) with a rotor of about 5 cm in diameter to yield a pellet of nucleic acid. The pellet was 
then dried and dissolved in 30 ul of water containing 10 ul of ribonuclease inhibitor (100 U/ul, Takara 
Shuzo, Japan) to give a nucleic acid solution, which was then subjected to cDNA synthesis. 

[2] Synthesis of cDNA Using Antisense Primer 



To 2 ul of the RNA sample solution prepared in [1] above were added 1 ul of a 15 pmol/ul anti-sense 
primer selected from the group of synthesized primers MS126, MS119, MS161, MS162, MS121, and MS163 
shown in the Table, 2 ul of 10 x RT buffer (100 mM Tris-HCl (pH 8.3), 500 mM KCl), 4 ul of 25 mM MgCl2, 
8 ul of 2.5 mM 4dNTPs and 1 ul of water, and the mixture was incubated at 65 - 70 °C for 5 min and then 
at room temperature for 5 min. To the mixture were added 1 ul of reverse transcriptase (25 U, Life 
Science) and 1 ul of ribonuclease inhibitor (100 U/ul, Takara Shuzo), and the mixture was incubated at 
37 °C for 20 min, at 42 °C for 30 min, and finally at 95 °C for 2 min, followed by immediate cooling to 
0 °C (synthesis of cDNA). 
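
The component volumes listed above can be summed to check the final reaction volume and the working concentrations they imply. The following sketch is illustrative only: the dictionary keys and helper name are hypothetical, and the concentrations are simply the stock values quoted in the text diluted into the total volume.

```python
# Sketch: total volume and final concentrations of the cDNA synthesis mixture.
# Volumes are in ul, taken from the protocol text; names are illustrative.
volumes_ul = {
    "RNA sample": 2,
    "anti-sense primer (15 pmol/ul)": 1,
    "10 x RT buffer": 2,
    "25 mM MgCl2": 4,
    "2.5 mM 4dNTPs": 8,
    "water": 1,
    "reverse transcriptase": 1,
    "ribonuclease inhibitor": 1,
}

total_ul = sum(volumes_ul.values())


def final_concentration(stock, volume_ul):
    """Dilute a stock concentration (any unit) into the total reaction volume."""
    return stock * volume_ul / total_ul


print("total volume:", total_ul, "ul")
print("MgCl2:", final_concentration(25, volumes_ul["25 mM MgCl2"]), "mM")
print("4dNTPs:", final_concentration(2.5, volumes_ul["2.5 mM 4dNTPs"]), "mM")
print("Tris-HCl:", final_concentration(100, volumes_ul["10 x RT buffer"]), "mM")
print("KCl:", final_concentration(500, volumes_ul["10 x RT buffer"]), "mM")
```

The components add up to 20 ul, so the 10 x RT buffer is used at 1 x, as would be expected.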

Amplification of DNA containing specific sequences was conducted by PCR (Saiki et al., Nature 324: 
126 (1986)). Thus, a 100 ul mixture containing 10 ul of cDNA solution, 10 ul of 10 x PCR buffer (100 mM 
Tris-HCl (pH 8.3), 500 mM KCl, 15 mM MgCl2, 1% gelatin), 8 ul of 2.5 mM 4dNTPs, 2 ul of 15 pmol/ul 
synthetic DNA primer (the same primer as used in the synthesis of cDNA), 3 ul of 15 pmol/ul synthetic 
DNA primer (the counterpart in a pair of primers, i.e., MS127-MS126, MS118-MS119, MS159-MS161, 
MS160-MS162, MS120-MS163, or MS120-MS121) and water was incubated at 95 °C for 5 min, then 
immediately cooled to 0 °C. One minute later, it was mixed with 0.5 ul of Taq DNA polymerase (7 U/ul, 
AmpliTaq™, Takara Shuzo) and overlaid with mineral oil. The resultant sample was then subjected to 
PCR. PCR was conducted by repeating 25 times a reaction cycle comprising the following treatments: 
95 °C for 1 min; 40 - 55 °C for 1 min; and 72 °C for 1 - 5 min in a DNA Thermal Cycler (Perkin Elmer 
Cetus). Finally, the reaction mixture was incubated at 72 °C for 7 min, which was followed by 
phenol/chloroform extraction and ethanol precipitation to obtain different amplified DNA fragments 
derived from each of the above-mentioned pairs of primers. The ethanol precipitation was carried out by 
adding 2.5 volumes of ethanol and either about 1/10 volume of 3 M acetic acid or an equal volume of 4 M 
ammonium acetate to the aqueous solution, mixing, centrifuging at 15,000 rpm for 15 min using a rotor 
of about 5 cm in diameter under cooling at 4 °C to pellet the precipitates, and drying the pellet. 

Example 22 



Cloning and Sequencing of Amplified DNA Fragments 

The cloning was carried out substantially in accordance with the method of Molecular Cloning, Cold 
Spring Harbor (1982). 

Dried DNA fragment (at least 1 pmole) obtained in Example 21, [2] above was blunt-ended with T4 
DNA polymerase (Toyobo), 5'-end phosphorylated with polynucleotide kinase (Toyobo) and ligated into 
the SmaI site of the multi-cloning site of 5 ng to 10 ng of pUC19 cloning vector. The cloning vector had 
been previously treated as follows: digestion with the restriction enzyme SmaI (Toyobo), 
phenol/chloroform extraction, ethanol precipitation, 5'-end dephosphorylation with alkaline phosphatase 
(Boehringer Mannheim) (Molecular Cloning (1982) Cold Spring Harbor), phenol/chloroform extraction, 
and ethanol precipitation. The ligated DNA was used to transform competent E. coli JM109 or DH5 cells 
(Toyobo). The transformation was carried out according to the manufacturer's protocol (COMPETENT 
HIGH, Toyobo). Plasmid clones were recovered from transformed cells conventionally. At least 20 
transformants were obtained using pUC19 cloning vectors containing each of the DNA fragments 
obtained using each of the pairs of primers in the same manner as that described in Example 21, [2]. 

Plasmid DNA was isolated from the corresponding transformant by a usual method and sequenced. The 
determination of the base sequence was conducted by means of a Fluorescent DNA Sequencer (GENESIS 
2000, Dupont) using, as sequencing primers, the following synthetic primers: 
5' d(GTAAAACGACGGCCAGT) 3' (SEQ ID NO 143) and 
5' d(CAGGAAACAGCTATGAC) 3' (SEQ ID NO 144) for the + and - strands of the DNA fragment to be 
sequenced. When the DNA fragment was longer than about 200 bp, the determination was conducted by 
subcloning said DNA into a clone of a deletion mutant in order to ensure reliable sequencing. 

The DNA fragments obtained using each of the pairs of primers shown in Example 21, and whose base 
sequences were determined, are listed below. 

Pair of primers    Clone(s) 

MS127-MS126        N22-1, N22-3, N22-4, H22-3, H22-8, H22-9 
MS118-MS119        N17-1, N17-2, N17-3, H17-1, H17-3 
MS159-MS161        O28-1, O28-2, O28-4 
MS160-MS162        N29-1, N29-2, N29-3 
MS120-MS163        O30-2, O30-3, O30-4 
MS120-MS121        N18-2, N18-3, N18-4, H18-1, H18-2, H18-3 


The letter used in the name of each clone represents the serum of the HC patient used in Example 1. 
The base sequences of the clones proved to have homology with a known base sequence of the HCV 
gene. The region on the HCV gene corresponding to each clone was designated as follows. 



Pair of primers    Region on HCV gene 

MS127-MS126        N22 
MS118-MS119        N17 
MS159-MS161        O28 
MS160-MS162        N29 
MS120-MS163        O30 
MS120-MS121        N18 



Among the resultant clones, the base and amino acid sequences of clones N22-1, N17-3, O28-1, N29-1, 
N18-4 and O30-3 are shown in SEQ ID NO 76, 81, 86, 89, 92, and 98, respectively. Base sequences of 
other clones obtained in the same manner are listed below in alignment with the base sequence of a 
clone which is disclosed in the Sequence Listing. In the list, the base sequence of the clone disclosed in 
the Sequence Listing is given in the uppermost row, which is followed by the others in the same region, 
showing only the bases which differ from those of the reference clone (that shown in the uppermost 
row). The figure following the name of a clone represents the nucleotide number of the base at the 5' 
terminus of the row. The nucleotides are numbered from the 5' terminus (base No. 1) conventionally. 
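
The row-numbering convention can be sketched as follows. The 50-base row width is inferred from the start numbers 1, 51, 101 and so on that appear in the listings, and `format_rows` is a hypothetical helper, not part of the specification; the example input is the first 100 bases of clone N22-1 as listed.

```python
# Sketch of the listing convention: each row is prefixed with the clone name
# and the 1-based nucleotide number of its first (5'-terminal) base.
def format_rows(name, sequence, width=50):
    """Split a sequence into fixed-width rows labelled with 1-based start numbers."""
    rows = []
    for offset in range(0, len(sequence), width):
        start = offset + 1  # nucleotides are numbered from base No. 1
        rows.append(f"{name} {start:>3} : {sequence[offset:offset + width]}")
    return rows


n22_1_first_100 = (
    "GGCATGTGGGCCCAGGGGAGGGGGCTGTGCAGTGGATGAACCGGCTGATA"
    "GCGTTTGCTTCGCGGGGCAACCATGTCTCCCCCACGCACTATGTGCCTGA"
)

for row in format_rows("N22-1.NUC", n22_1_first_100):
    print(row)
```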



BASE SEQUENCE OF EACH CLONE IN N22 REGION 

N22-1.NUC   1 : GGCATGTGGGCCCAGGGGAGGGGGCTGTGCAGTGGATGAACCGGCTGATA  (SEQ ID NO: 76) 
N22-3.NUC   1 :                                                     (SEQ ID NO: 77) 
H22-3.NUC   1 :                                                     (SEQ ID NO: 78) 
H22-8.NUC   1 :                                                     (SEQ ID NO: 79) 
H22-9.NUC   1 :                                                     (SEQ ID NO: 80) 



N22-1.NUC  51 : GCGTTTGCTTCGCGGGGCAACCATGTCTCCCCCACGCACTATGTGCCTGA 
N22-3.NUC  51 : 
H22-3.NUC  51 : 
H22-8.NUC  51 : 
H22-9.NUC  51 : 

N22-1.NUC 101 : AAGCGACGCCGCAGCGCGCGTCACCCAGATCCTCTCCAACCTTACCATCA 
N22-3.NUC 101 : 
H22-3.NUC 101 : 
H22-8.NUC 101 : 
H22-9.NUC 101 : 

N22-1.NUC 151 : CT^GCTGTTGAAGAGGCTTCACCAGTGGATTAATGAGGACTGCTCCACG 
N22-3.NUC 151 : 
H22-3.NUC 151 : 
H22-8.NUC 151 : 
H22-9.NUC 151 : C T 

N22-1.NUC 201 : CCATGCTCCGGCTCGTGGCTCAGGGATGTTTGGGACTGGATATGC^ 
                T* «T 
                T» »T 
                T» «T« «T • 

N22-1.NUC 251 : ATOX^CTGATTGCAAGACCTGGCTCCAGTCCAAGCTCCT 
                T i 
                G-"AG^«C*T C*" 
                G**«AG---OT ♦ C«*. 
                G'«"AG**'C*T C*«« 

N22-1.NUC 301 : CGGGGGTCCCTTTTTTCTCATGCCAGCGTGGGTACAAGGGGGTTTGGCGG 
                C • 
                ••••A CC A A**C 
                ••••A COT A A^C 
                ••••A CC A • A--C 

N22-1.NUC 351 : GGAGATGGCATCATGTATACCACCTGCCCATGTGGAGCACAAATCACCGG 
                C«"«* • 
                C*G C G* * • • 
                C-A C •..•G«.-« 
                C Gv* 

N22-1.NUC 401 : ACATGTCAAAAACGGTTCTATGAGGATCGTTGGGCCTAGAACCTGTAGCA 
                T 
                ♦ T AC»*«C*«C 
                T C AC* • *C"C 
H22-9.NUC 401 : 

N22-1.NUC 451 : ACUVCGTGGCACGGAACATTTCCCATCAACGCGTACACCACAGGCCCCO?GC 

N22-1.NUC 501 : ACACCCTCCCCGGCGCGAAACTATTCCAGGGCGTTGTGGCGGGITGGCCAT 
                T--A G C A A»"GC 
                A G C • A A««TGC 
                • A G T A--TGC 
                A G T*A • ♦ «A« »TGC 

N22-1.NUC 551 : TGAGGAGTATGTGGAGGTCACGCGGGTGGGGGATTTCCACTACGTGACGG 

N22-1.NUC 601 : GCATGACCACTGACAACGTGAAATGCCCATGCCAGGTTCCGGCCCCCGAA 
                A-* • 
                T C 
                T C 
                T *C 

N22-1.NUC 651 : TTCTTCACAGAATOGGATGGGGTGCGGCTGCACAGGTACGCTCCGGCGTG 
                G 
                G* *G A A 
                G-*G A A A 
H22-9.NUC 651 : T 

N22-1.NUC 701 : CAAACCTCTCCTGCGGGATGAGGTCACATTCCAGGTCGGGCTCAACC^ 

N22-1.NUC 751 : ATACGGTTGGGTCACAGCTCCCATGTGAGCCCGAACCGGATGTAACAGTG 
                TCC 
                TCC 
                TCC 

N22-1.NUC 801 : GTCACCTCCATGCTCACC 



BASE SEQUENCE OF EACH CIiONE IN N17 REGION 



N17-3-NUC 



1 z TGTGAGCCCGAACCGGATGTAACAGTGGTCACCTCCATGCTCACCGACCC 



(SEQ ID NO: 81) 

N17-1.NUC 
(SEQ ID NO: 82) 



N17-2.NUC 



(SEQ ID NO: 83) 
H17-1.NUC 



1 : 



(SEQ ID NO: 84) 
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H17-3.NUC 



1 : 



(SEQ ID NO: 85) 
N17-3.NUC 51 



N17-1.NUC

N17-2.NUC 

H17-1.NUC 

H17-3.NUC

N17-3.NUC 

N17-1.NUC 

N17-2.NUC 

H17-1.NUC 

H17-3.NUC

N17-3.NUC 

N17-1.NUC 

N17-2.NUC 

H17-1.NUC

H17-3.NUC 

N17-3.NUC 

N17-1.NUC 

N17-2.NUC 

H17-1.NUC

H17-3.NUC

N17-3.NUC 

N17-1.NUC 

N17-2.NUC 



51 
51 
51 
51 
101 
101 
101 
101 
101 
151 
151 
151 
151 
151 
201 
201 
201 
201 
201 
251 
251 
251 



CTCCCACATTACAGCAGAGGCGGCTAGGCGTAGGCTGACCAGAGGGTCTC 



CCCCTTCCTCGACC^TTCTTCAGCTAGTCAGTTGTCTGCGCOTTCTTCG 



T»T«G 



T*G 



T«G 



CA 



CA 



CO-CCT- 



CC 



CT 



CAGGCAACATGCACTACCCATCAGGGCGCCCCAGACACTGACCTCATCGA 

A« «G T A*A*T G 

A* • • *G T-A*T G • 

A- ♦ ♦ -G T*A«T« • --G* • *G 

A« * • *G T«A«T« • • *G* • *G 

GGCCAACCTCCTGTGGCGGCAGGAGATGGGCGGAAACATCACCCGCGTGG 



A-«G 



A~G 



AGTCAGAGAACAAGATAGTAATTCTAGACTCTTTTGAACCGCTTCGAGCG 
H17-1.NUC 251 : G G C-*C

H17-3.NUC 251 : G^**G G C**C 

N17-3.NUC 301 : GAGGAGGATGA 

N17-1.NUC 301 : 

N17-2.NUC 301 : 



10 


H17-1.NUC 
H17-3.NUC 


301 : 
301 : 






BASE SEQUENCE OF EACH CLONE IN O28 REGION


15 


028-1. NUC 
(SEQ ID NO: 


1 : 

86) 


GTGGTAGTCCTGGACTCGTTGGAGCCGCTTCAAGCGAAGGAAGGTGAGAG 




028-2. NUC 


1 : 




20 


(SEQ ID NO: 


87) 






02 8-4. NUC 


1 : 






(SEQ ID NO: 


88) 




25 


02 8-1. NUC 


51 : 


GGAAGTGTCCGTTGCGGCGGAGATCCTGCGGAAGACCAGGAAATTCCCCG 




028-2 .NUC 


51 : 






028-4. NUC 


51 : 




30 


02 8-1. NUC 
028-2. NUC 
02 8-4. NUC 


101 : 
101 : 
101 : 


CAGCGATGCC CGTATGGGCACGC CCGGACTACAACCCACCATTACTAGAG 


35 


028-1. NUC 


151 : 


TCTTGGAAGAACCCGGACTACGTCCCTCCAGTGGTACACGGGTGCCCATT 




028-2. NUC 


151 : 






028-4. NUC 


151 : 




40 


028-1. NUC 


201 : 


GCCGCCTACCAAGGCCCCTCCAATACCACCTCCACGAAGAAAGAGAACGG 




028-2. NUC 


201 : 
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nofl-A NTIP 


9fl1 
ZUJL • 




u/o"l • IiUL 


9^1 


X xvj ILL XuaLAuAAI LL Xv>Lb X \» ILL J^L lutL ± Aouv.uu'wj^ x a jljtvv— n. 


UjSO — z • NUC 


*sdx : 










098— 1 NTIP 


301 s 

•J u JW . 


AAGACCTTTGGCAGTTCCGG^ 




JUi ■ 




*JjtO ~ « 


O \f JL . 






K1 , 
JJ J. * 


» cnxzrc p tp p app & fjfip ptp p*ip m a aha ar atgtj^ggatCCGACGCTG 

LVJVJLLL X LL JL\JJTtLLM\g\JLL X^L\^LU*UlUWw/lA\JVnwwiJL\*\*griMww ^ 


028-2 .NUC 


351 : 




UZo-** •NUC 


JDl : 




QZo—X .NUC 


4UX 


* /"*m/*/" , m » ^riv* r*m/'^/^ n mf*f*f*r*r*f*f*f*rw*tnf* a fW* & filMiPPfiflififi/ZAPPP'PfiAT 
! AljXC(jTACi\^C TCCAXajCCCCCCCX^^ * 


028-2 .NUC 


401 : 




U^o— 4 . NUC 


4UX 




Uzo-1 • NUC 


4DX 


I C T C AviL uAL (jVj t» 1 C X XajxLj X L iALCVj iiwuLuAwwWsuLUluLWiwft^u a 


028-2 . NUC 


451 ' 




UZO-n • NUC 






U*0— 1 . NUC 


CA1 

DUX 


» /-»/^rn/~*cTv^r»m<^'/^rnr«/^ * ov oi/T»m* n Kr > ^aw?&rA^/2P^Pr"T , P& ATTAPAPPAT 
S Hj X L XajL XuL X Ll?A iu X L L X ALnLjfLXAA3jnLA\9\3rLVVLL Jl i«w»ww» * 


028-2 .NUC 


501 




.NUC 


DUX 




028-1 .NUC 


DDI 


: viCvi L C(jC WjAkjtiAoAL»C^AAtjC TvjCC CA X X AAlijLw, iuftUUUiLLb x x x u 


028-2. NUC 


551 




028-4. NUC 


551 




028-1. NUC 


601 


: CTGCGCC^CC^CAAC^TGGTCT^ 


028-2.NUC 


601 




028-4. NUC 


601 
028-1. NUC 651 

028-2. NUC 651 

028-4. NUC 651 

028-1. NUC 701 

028-2. NUC 701 

028-4. NUC 701 
BASE SEQUENCE OF EACH CLONE IN N29 REGION 



GCGGCAGAAAAAGGTCACATTTGACAGACTGCJVAGTC 



ACCGGGACGTGCTCAAGGACATGAAGGCCAAGGCGTCCAC 



N29-1.NUC 
(SEQ ID NO: 89) 



1 : ACTACCGGGACGTGCTGAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAG 





N29-2.NUC 


1 




(SEQ ID NO: 


90) 


20 


N29-3.NUC 


1 




(SEQ ID NO: 


91) 




N29-1.NUC 


51 


25 


N29-2.NUC 


51 




N29-3.NUC 


51 




N29-1.NUC 


101 


30 


N29-2.NUC 


101 




N29-3.NUC 


101 




N29-1.NUC 


151 


35 


N29-2.NUC 


151 




N29-3.NUC 


151 




N29-1.NUC 


201 


40 


N29-2.NUC 


201 




N29-3.NUC 


201 



51 : GCTAAACTTCTATCTGTAGAGGAAGCCTGCAAGCTGACGCCCCCACACTC 

T 

T 

GGCCAGATCTAAATTTGGCTACGGGGCAAAGGACGTCCGGAGCCTGTCCA 



GCAAGGCCGTOAACCACATCCGCTCCGTGTGGAAGGACTTGCTGGAAGAC 



ACTGAGACACCAATTGACACCACCATCATGGCAAAAAATGAGGTTTTCTG 



N29-1.NUC


251 


• 


N29-2.NUC 


251 


• 
• 


N29-3.NUC 


251 


• 
« 


N29-1.NUC


301 


• 
• 


N29-2.NUC 


301 


• 
• 


N29-3.NUC 


301 


• 
* 


N29-1.NUC


351 


m 
« 


N29-2.NUC 


351 


• 
• 


N29-3.NUC


351 


• 

9 


N29-1.NUC 


401 


m 
• 


N29-2.NUC


401 


* 
* 


N29-3.NUC


401 


* 
• 


N29-1.NUC


451 


* 
* 


N29-2.NUC 


451 


* 
• 


N29-3.NUC 


451 




N29-1 .NUC 


501 


m 
m 


N29-2.NUC 


501 


• 
• 


N29-3.NUC 


501 


• 
* 


BASE SEQUENCE OF EACH CLONE IN O30 REGION


O30-3.NUC 


1 


• 



(SEQ ID NO: 98) 

O30-2.NUC 1 : 

(SEQ ID NO: 99) 

O30-4.NUC 1 : 

(SEQ ID NO: 100) 
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SI • 


a AC ATP rCZTGTCG ACUZ AfiTP A ATTT AP P A A TCTTGTGAjCTTGGCCCCCGA 




Sfl • 












101 2 
X U X ■ 


GG PP AG A P AGGP P AT A AGGTPAP TPAPAGAGC GGC TTTACATC GGGGGC C 




XUU 3 




u ju— ** * nuL- 


i ni . 




UjU-J • MUL 




pppmp upmj xmmp * » AfiT!/2PP AP A AP1V2Pnn , l ,, PA , l , PniPPdni'W3PPGCGTC 


OoU— z . NUC 


n en . 
XjU * 




030—4 • NUC 


1 CI . 

xOl ! 




OJU— O . NUC 


Z U J. s 


i f^ppr!OPnf piv^ k pp & p»p & pPTTTPrflni nrpnPPPiV IkP ATY^TTAPTTGA A 


030-2 .NUC 


o Art 

200 i 




030— 4 .NUC 






O JU— 3 « NUC 


zDx : 


► fippp'ppTrpikppp'iviTr fcPPTWh » ar.p*pppn^/7aPTY2P APG ATGPTTG 
» w w w Itl bUAbLL Xwr X L-vjAvjC IbWiAAurL Xw wAVSwvn-w xaj wAwUaxatw xxvj 


030-2 .NUC 


OCA 




UOU — 4 . nUU 










• TGTGPGfi AG APG AP P TTGTPGTTATPTGTGATAGCGCGGGAACTGAGGAG 


UjU-Z • NUC 






rton a http 

U JU- 4 . MUU 






Aon *3 xtttp 
UJU-j • NUC 




• /^AP/iPfif2P^!.Af2PP r PAPOAf2*PP'P*PP APfZTSAfiGPTATGAPTAGGTACTCTGC 

» VVAw V»w Out UAU w w X Aw ljrAV> Xw X X l^A^.\3UAvJUV. AAlUAVf 1AVJU XXVw X w X ww 


O30-2 .NUC 


OCA 

Sou 




030-4. NUC 


351 




O30-3.NUC 


401 


i CCCCCCCGGGGACCCGCCCCAACCAGAATACGACTTGGAGCTwATAACAT 


030-2. NUC 


. 400 




O30-4.NUC 


401 




030-3. NUC 


451 


: CATGTTCCTCCAATCTGTCGGTC 
O30-2.NUC


450 : 






451 : 




n^fi— ^ HTTP 


■ifil « 


TAPTATPTPAPPPGTGAPPPPAPPAPPPPCPTAGCGCGGGCTGCGTGGGA 

IflU XAX^l WAV^UwlUAvvUvil\fUlwUV<UUKf X X1>\7V*\7** wW%* * www********* 


Ain 0 MTTP 


jUU * 




r*^ri— A mtip 


J VI X 4 




nift— mtip 




' GAP AGPT AGAP AP APTPP AGTP AAPTPPTGGCTAGGCAACATCATCATGT 




J JU « 






551 : 

J .J X « 






DUX « 


- APGPGPP PAPP TTATGGGPAAGGAlVlATTrPx^ 








O^fl — A NT1P 


601 < 

O U X i 




n^ft—^ MTTP 


SI - 


» AT^PT l TY^TAGPPPAf^Af^ AAPTTGA AAAA(^PPTAGATT^ 

> Ax\*L X IV X*lA3*«***«A*9wi7\w***^cMar X l\jnlUUUUJV<Uv XAWIX x w x a 




ccn 
O jU i 




*JJU — 4 « NUC 


ODl ! 




OoU— J . NUC 


/ Ul 


; CWjViVjCCAC 1 1 AC JL CCAJL XAjAVjCCAC 1 l\sACC Xv^UxAlWil X UnAU 


O30— 2 . NUC 


700 5 




* NUC 


/ UJL 




n^fi— "X MT1P 


7 SI 

( J JL 


• G APTPP APGGTPTTAGPGPATTTTPAPTPPATAGTTACTCTC(^^ 

» \J^*Var X WXwwV? X ** X X**W wWX ^ X X ^/IV>> X w w*VAX*V* * ***** * v* ■*■ **> *— * ».*_»^» a w» «u 


UJU-Z . NUC 


TEA 

/ou 




uj v"*Tt • revive 


7S1 




O30-3.NUC 


801 


: ATCaATAGGGTGGCxTCATCC^ 


O30-2.NUC 


800 




O30-4.NUC 


801 




O30-3.NUC 


851 


: AGTCTGGAGACATCGGGCCAGAAGCGTCCGCGCTAAGCTACTGTCCCAGG 


O30-2.NUC 


850 
O30-4.NUC


851 : 


O30-3.NUC 


901 : 


O30-2.NUC 


f\ ^\ A 

900 


O30-4.NUC 


901 


O30-3.NUC 


951 


O30-2.NUC 


a A. 

950 


O30-4.NUC 


951 


O30-3.NUC 


1001 


O30-2.NUC 


1000 


O30-4.NUC 


1001 


O30-3.NUC 


1051 


O30-2.NUC


1050 


O30-4.NUC 


1051 


O30-3.NUC 


1101 


O30-2.NUC 


1100 


O30-4.NUC 


1101 


O30-3.NUC 


1151 


O30-2.NUC 


1150 


O30-4.NUC 


1151 


BASE SEQUENCE OF EACH CLONE IN N18 REGION


N18-4.NUC 


1 


(SEQ ID NO: 


92) 


N18-2.NUC 


1 


(SEQ ID NO: 


93) 



GGGGGAGGGCCGCCACCTGTGGCAAATAC C TCTTCAACTGGGCAGTAAAG 



AC 



C^GCTG^CTC^CTCC^^ 



CGGCTGGTTCGTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTO 



CT 



CGTGCCCGACCCCGCTGGTTCATGTGGTGCCTACTCCTACTTTCCGTA 



GGGGTAl 



GGCATCTACCTGCTCCCCAACCGATGAGCGGGGAGCTAAACACT 



CCAGGCCAATAGGCCATCCCC 



N18-3.NUC 



1 : 
(SEQ ID NO: 94) 

H18-1.NUC 1 : 

(SEQ ID NO: 95) 

H18-2.NUC 1 : 

(SEQ ID NO: 96) 

H18-3.NUC 1 :

(SEQ ID NO: 97) 
N18-4.NUC 51 
N18-2.NUC 51 
N18-3.NUC 51 
H18-1.NUC 51 
H18-2.NUC 51 
H18-3.NUC 51 
N18-4.NUC 101 
N18-2.NUC 101 
N18-3.NUC 101
H18-1.NUC 101 
H18-2.NUC 101 
H18-3.NUC 101 
N18-4.NUC 151 
N18-2.NUC 151 
N18-3.NUC 151 
H18-1.NUC 151 
H18-2.NUC 151 
H18-3.NUC 151 



GACATC CGTACTGAGGAGTCAATTTATCAATGTTGTGACTTGGACC CCGA 




































GGCCAGACAGGCCATAAGGTCGCTCACAGAGCGGCTTTATATCGGGGGCC 



CCTTGACCAATTCAAAAGGGCAAAACTGCGGCTATCGCCGGTGCCGCGCC 



C 
C 
C 



T 



T- 



N18-4.NUC


201 




N18-2 .NUC 


201 


5 


N18-3.NUC 


201 




H18-1 .NUC 


201 




H18-2.NUC 


201 


70 


H18-3 .NUC 


201 




Nlfi-4 NUC 

X«XU » * 11U^ 


251 






251 

A J JU 


ID 


N1 ft— 1 NTIP 


^ «/ A 




Til ft— 1 NTTP 


^ jii 




TT1 ft — ? NTTP 


251 


20 


Ml Q *5 "HTTP 






N1 ft — A fJTTP 






N18-2 NUC 


301 

\J 


25 


N18-3 .NUC 


301 




H18-1 .NUC 


301 




H18-2 .NUC 


301 


30 


H18-3 . NUC 


301 




Nlfl-4 NUC 


351 




N18-2 .NUC 

A-l <Jk> V A* * A 1 ! W %rf 


351 


35 


N18-3.NUC 


351 




H18-1.NUC 


351 




H18-2.NUC 


351 


40 


H18-3.NUC 


351 




N18-4.NUC 


401 



AGCGGCGTGCTGACGACTAGCTGCGGTAATACCCTCACATGTTACTTGAA 

C T^T • 

♦.. *C T . • • 

* C T 

GGCCTCTGCAGCCTCTCGAGCiraGAAGCTC 

A A* 

A A- 

A A 

TGTGCGGAGACGAC CTTGTCGTTATCTGTGAAAGCGCGGGAACCCAGGAG 
G 

G C G 

G C 

G C 

GACGCGGCAAACCTACGAGTCTTCACGGAGGCTATGACCJVGGAATTCC^ 

G • 

G • 

G 

• • G • 

C 



50 



N18- 


2. NUC 


401 


• 


N18- 


3. NUC 


401 


* 
• 


H18- 


l.NUC 


401 


• 


H18- 


2 .NUC 


401 


* 
• 


H18- 


3. NUC 


401 


• 



Base sequences of clones in each of six regions are summarized in SEQ ID NO 64 to 69. Base
sequences of SEQ ID NO 64 to 69 and 76 - 100 show the base sequences of the + strand of DNA fragments
which were derived from HCV gene and inserted into each plasmid used for the transformation. These
clones are double-stranded DNA. Plasmids used for the sequencing of clones were designated by adding the
prefix "pUC" to the name of each clone; for example, the plasmid used for sequencing clone N22-1 was
designated as plasmid pUCN22-1. Each plasmid contained one DNA molecule.

These base sequences are those of clones obtained by cloning cDNA synthesized from
RNA isolated from serum of patient(s) suffering from HC. Therefore, these sequences are specific for clones
originating from serum of HCV-infected patients and cannot be found in or obtained from serum of healthy
subjects. Indeed, cDNA prepared from RNA (if there is any) obtained from a healthy subject under more
stringent conditions, for instance, by increasing the number of PCR reaction cycles in Example 21 [2]
three- or four-fold, i.e., by repeating them 60 - 100 times, did not show any homology in base sequence
with those shown in SEQ ID NO 64 - 69 and 76 - 100. Consequently, the base sequences of the clones shown
in SEQ ID NO 64 - 69 and 76 - 100 are specific for those obtained from serum of HC patients.

The above table indicates that there must be more than one virus in a patient. 

Example 23 
Preparation of Clone 1530U

[1] Preparation of Clones 1728, 2217, and 2918 

Clones N17-3 and O28-1 were ligated by PCR using their overlapping region. One µl (about 0.5 to 1 µg/µl)
of each DNA fragment from clones N17-3 and O28-1 (311 and 740 bp, respectively) was added into a
reaction mixture containing 10 µl of 10 x PCR buffer (100 mM Tris-HCl (pH 8.3), 500 mM KCl, 15 mM
MgCl2, 1% gelatin), 8 µl of 2.5 mM 4 dNTPs, 5 µl each of 20 pmol/µl synthetic primers MS118 and
MS161, and 76.5 µl of water. After thorough mixing, the mixture was heated at 95 °C for 5 min, then
immediately cooled to 0 °C. One minute later, 0.5 µl of Taq DNA polymerase (7 U/µl, AmpliTaq™,
Takara Shuzo) was added to the mixture, which was mixed and overlaid with mineral oil. The sample was
then subjected to PCR. PCR was conducted by repeating 25 times a reaction cycle comprising the following
treatments: 95 °C for 1 min; 37 °C for 1 min; and 72 °C for 2 min, in a DNA Thermal Cycler (Perkin
Elmer Cetus). This was followed by an incubation at 97 °C for 2 min. The mixture was immediately cooled
to 0 °C, kept at the same temperature for 2 min, mixed with 0.5 µl of Taq DNA polymerase (7 U/µl,
AmpliTaq™, Takara Shuzo), and overlaid with mineral oil. The sample was then treated in the same manner
as above by repeating 25 times a reaction cycle comprising the following treatments: 95 °C for 1 min;
55 °C for 1 min; and 72 °C for 2 min. After the final treatment at 72 °C for 7 min, the resultant reaction
solution was treated with phenol/chloroform and then precipitated with ethanol. The two DNA fragments
were thus ligated and amplified by PCR. The ligated DNA sample was fractionated by agarose gel
electrophoresis and a gel slice containing a fragment of about 1000 bp was excised from the gel
(Molecular Cloning (1982) Cold Spring Harbor). The resultant DNA fragment was then modified as described
in Example 22 and ligated into the SmaI site of the multi-cloning sites of pUC19, cloned and screened as
described in Example 22 to obtain plasmid pUC1728. The resultant clone derived from serum of an HC
patient was designated as clone 1728, whose base sequence is given in SEQ ID NO 8.
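
The joining step above relies on the two fragments sharing an overlapping end. As a rough in-silico illustration only (it does not model the PCR itself), the following sketch merges two sequences at their longest exact suffix/prefix overlap, so that the joined length is the sum of the two lengths minus the overlap. The function names and the toy sequences are hypothetical and are not taken from the sequence listing; real clones may differ at some positions within their overlap, in which case an exact-match search such as this one would need a mismatch tolerance.

```python
def longest_overlap(left: str, right: str, min_len: int = 4) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_len - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def join_by_overlap(left: str, right: str, min_len: int = 4) -> str:
    """Merge two fragments so that the shared overlap appears only once."""
    size = longest_overlap(left, right, min_len)
    if size == 0:
        raise ValueError("no overlap of the required length was found")
    return left + right[size:]


# Toy fragments (hypothetical, not taken from the sequence listing).
fragment_a = "ATGCCGTTAGGCTTCGAGCG"
fragment_b = "CTTCGAGCGGAGGATGAAAC"

overlap = longest_overlap(fragment_a, fragment_b)
joined = join_by_overlap(fragment_a, fragment_b)

# Joined length = len(a) + len(b) - overlap.
print(overlap, len(joined), joined)
```

The same length relation can be used as a sanity check on gel results: two fragments of known size joined through an overlap should give a product shorter than their sum by the overlap length.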

In the same manner as above, plasmid pUC2217 was obtained from clones N22-1 and N17-3. This
plasmid contains at the SmaI site a DNA fragment derived from serum of an HC patient in the following
order from the 5' to the 3' side: EcoRI restriction site from pUC19, DNA from clone N22-1, DNA from clone
N17-3, and HindIII restriction site. Base and amino acid sequences of clone 2217 are given in SEQ ID NO 70.

In the same manner as above, clone 2918 was obtained from clones N29-1 and N18-4; its base
and amino acid sequences are given in SEQ ID NO 72.

[2] Preparation of Clone 1718 

There is a 43 bp sequence common to clones 1728 and 2918. These fragments of 1004 and 857 bp
were ligated by PCR substantially in accordance with the procedures described in [1] above,
except that the elongation step in the PCR reaction using Taq polymerase was conducted at 72 °C for 5 min.
The resultant plasmid pUC1718 contained a DNA fragment having a base sequence derived from HCV
gene at the SmaI site, in which the EcoRI site of pUC19 is located to the 5' side of clone N17-3. (N17
region is located to the 5' side of N18 region on HCV gene.) Base and amino acid sequences of clone 1718
are given in SEQ ID NO 73.
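
The cycling conditions of [1] and the longer elongation step of [2] can be written down as data, which makes the total programmed hold time easy to compare. The sketch below is only a bookkeeping aid. The class and function names are hypothetical, ramp times and the initial and intermediate denaturation steps are ignored, and it assumes that the 5 min elongation replaces the 2 min step in both rounds of 25 cycles.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Round:
    """One block of identical PCR cycles; each step is (temperature in C, minutes)."""
    cycles: int
    steps: Tuple[Tuple[float, float], ...]

    def minutes(self) -> float:
        return self.cycles * sum(duration for _, duration in self.steps)


def total_minutes(rounds: List[Round], final_extension_min: float = 0.0) -> float:
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    return sum(r.minutes() for r in rounds) + final_extension_min


def two_round_program(elongation_min: float) -> List[Round]:
    # Round 1 anneals at 37 C, round 2 at 55 C, 25 cycles each, as in [1].
    return [
        Round(25, ((95, 1), (37, 1), (72, elongation_min))),
        Round(25, ((95, 1), (55, 1), (72, elongation_min))),
    ]


program_1 = two_round_program(elongation_min=2)  # conditions of [1]
program_2 = two_round_program(elongation_min=5)  # assumed reading of [2]

print(total_minutes(program_1, final_extension_min=7))
print(total_minutes(program_2, final_extension_min=7))
```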

[3] Preparation of Clone 2218

Overlapping clones 2217 and 1718 were ligated by taking advantage of a unique restriction site which
exists in the overlapping regions of both clones. Upon digestion with restriction enzyme XbaI, pUC2217
was cleaved at two sites, i.e., one in a sequence from clone N17-3 and the other in a sequence from pUC19,
and a small fragment of less than about 40 bp and a large fragment containing most of the sequences from
vector and clone 2217 were separated by agarose gel electrophoresis and the larger fragment,
pUC2217/XbaI, was extracted. Plasmid pUC1718 was also cleaved at two sites, one within a sequence from
clone N17-3 and one from pUC19, and a larger DNA fragment, 1718/XbaI, of about 1545 bp containing most
of the sequences from vector and clone 1718 was separated by agarose gel electrophoresis and extracted.
The ligation of clones 2217 and 1718 was accomplished on the assumption that plasmids
pUC2217 and pUC1718 each contain the DNA fragment in the same orientation. Thus, 10 ng of pUC2217/XbaI
and 50 ng of 1718/XbaI were ligated in the presence of T4 DNA ligase and the ligation mixture was
incubated with competent E.coli JM109 cells and cloned in the same manner as in Example 22. Transformants
containing plasmid pUC2218, which contains clone N17-3 religated at the XbaI site, were obtained. The
plasmid pUC2218 contains at its SmaI site, in order from the EcoRI side, the following regions without
overlapping: clones N22-1, N17-3, O28-1, N29-1, N18-4. Base and amino acid sequences of the resultant
clone 2218 are given in SEQ ID NO 74.
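
The XbaI digestion above depends on where the recognition sequence TCTAGA occurs. As a hedged illustration, the following sketch scans a sequence for restriction sites by plain string search. The 50-base test string is the row of the N17 alignment that appears to cover positions 251 to 300 of clone N17-3 (this assignment is an assumption about the listing, whose labels are partly illegible), and the helper name is hypothetical. Only recognition sequences are modelled, not the cut positions or the resulting fragment sizes.

```python
RECOGNITION_SITES = {
    "XbaI": "TCTAGA",
    "HindIII": "AAGCTT",
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
}


def find_sites(sequence: str, enzyme: str) -> list:
    """Return 0-based start positions of every occurrence of the enzyme's site."""
    site = RECOGNITION_SITES[enzyme]
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


# Assumed to be bases 251-300 of clone N17-3, copied from the alignment above.
n17_3_row = "AGTCAGAGAACAAGATAGTAATTCTAGACTCTTTTGAACCGCTTCGAGCG"

xbai_hits = find_sites(n17_3_row, "XbaI")
print(xbai_hits)  # 0-based offsets within the 50-base row
```

If the row really starts at base 251, the single hit at offset 22 would place the XbaI site at bases 273 to 278 of the 311 bp clone, i.e., near its 3' end.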

[4] Ligation of N15 Region and O30 Region Corresponding to 3' Terminal Region of HCV Gene 

Clone O30-3 is shown in SEQ ID NO 98. Plasmid pUCO30 contains a DNA fragment having a sequence
corresponding to the 3' terminal region of HCV gene at the SmaI site of pUC19 in the order of, from 5' to
3', EcoRI site and clone O30-3. Plasmid pUCN15 contains a DNA fragment of HCV gene, clone N15, in the
forward orientation at the SmaI site of pUC19 in the order of, from 5' to 3', EcoRI site and clone N15.

Plasmid pUCO30 was cleaved at a cloning site, SacI, of pUC19 and blunt-ended with T4 DNA
polymerase conventionally, which was followed by cleavage at another cloning site, HindIII, of pUC19 to
obtain a DNA fragment O30 (SacT4/Hind) derived from HCV gene. Plasmid pUCN15 was digested with
XbaI, blunt-ended, and digested with HindIII to obtain a larger DNA fragment pUCN15 (XbaT4/Hind) which
contains a sequence from clone N15-1 and all the region of the HindIII fragment of pUC19. About 80 ng of
DNA fragment O30 (SacT4/Hind) and about 20 ng of DNA fragment pUCN15 (XbaT4/Hind) were ligated in the
presence of T4 DNA ligase in 20 µl of reaction mixture. The ligation mixture was incubated with
COMPETENT HIGH JM109 (Toyobo) according to the protocol provided by the manufacturer and transformants
containing the desired plasmid pUC15-30 were isolated. Taking advantage of the fact that said plasmid
pUC15-30 has only one site for each of the restriction enzymes BglII and HindIII, it was subjected to PCR
using a primer MS174 having a BglII site in a sequence derived from clone O30-3.

PCR was conducted using, as a template, 10 ng of pUC15-30 and primers MS174 and MS175. The PCR
fragment was then digested with BglII and HindIII and the resultant fragment was ligated to a BglII-HindIII
fragment containing the vector fragment of pUCO30 to obtain plasmid pUC15-30U having polyU attached to
the 3' terminus of clone O30-3.

[5] Ligation of N15 to O30 Regions

There is an ApaI site within a region common to N15 and N22 regions. There is also an ApaI site within
a region common to N18 and O30 regions. A DNA fragment isolated from pUC2218 with ApaI was inserted
into the ApaI site of pUC15-30 appropriately to obtain plasmid pUC1530U. Thus, plasmid pUC2218 was
digested with ApaI and 30 ng of the desired DNA fragment, pUC2218/ApaI, was isolated by agarose gel
electrophoresis conventionally. Plasmid pUC15-30 was digested with ApaI completely and the desired DNA
fragment was isolated and dephosphorylated. Ligation was conducted using 30 ng of pUC2218/ApaI and 20
ng of the dephosphorylated DNA fragment in a final volume of 10 µl. All of the ligation mixture was used
to transform COMPETENT HIGH JM109 (Toyobo). From the transformants was prepared the desired plasmid
pUC1530U, which contains at the cloning site, SmaI, a clone 1530U having a sequence from regions N15 to
O30 without overlapping. Base and amino acid sequences of clone 1530U were determined in the same
manner as that used in Example 22 and are shown in SEQ ID NO 75.

The amino acid sequence of the ligated region comprising N15 to O30 regions has a high homology
with a part of non-structural proteins NS4 and NS5 of Flavivirus, a virus related to HCV. It was also
confirmed that said region is homologous to a sequence encoding a part of the NS4 region and all of the
NS5 region by comparison with the known sequences of the entire HCV gene disclosed by the aforementioned
Chiron, Shimotohno, or Takamizawa. In conclusion, the clones herein disclosed, whose sequences are shown
in the Sequence Listing, correspond to a part of NS4 and all of NS5. As the next step, polypeptides
encoded by said clones were evaluated as to the ability to react immunologically with antiserum of HC
patients.

Example 24

Modification of DNA for the Expression of HCV Polypeptide Encoded by Clone 1530U 

The expression of all or a part of the regions of clone 1530U which encodes HCV polypeptide can be
accomplished using any of the methods described hereinafter.

[1] Modification of DNA for the Expression of a Part of HCV Polypeptide Encoded by Clone 1530U in E.coli

This method is used to express a desired polypeptide free from additional amino acid sequence. 

Clone 1530U appears to encode an ORF derived from HCV gene (hereinafter referred to as NS5N)
from No. 1246 (C) to 1692 (C) of the base sequence of SEQ ID NO 75, which can be expressed by inserting an
ATG initiation codon at the 5' side of said gene in frame. When a part of the amino acid sequence of NS5N
is desired to be expressed, an ATG initiation codon and a termination codon are inserted at the 5' and 3'
sides of a gene encoding said amino acid sequence such that the frame of these codons is in conformity
with that of the gene. The insertion of an ATG initiation codon upstream of the 5' terminus of said gene
may be accompanied by an addition of a foreign polypeptide which is not encoded by HCV gene to the
N-terminus (amino terminus) of an amino acid sequence of SEQ ID NO 75. This may happen when a sequence of
pUC19 is inserted between the ATG codon and DNA encoding HCV polypeptide at the time of insertion of the
ATG codon. The modification of DNA was carried out by PCR using the following synthetic DNAs as primers.
5' primer: 

MSNS5-1: 5' GCAAGCTTATGCAGCGTGGGTACAAGGGGGTT 3' (SEQ ID NO 183) 
3' primer: 

MSNS5-2: 5' GCGAATTCAGATCTTCATCAGAGCTGTGACCCAACCGTATATTGGTT 3' (SEQ ID NO 184) 
The synthetic DNAs were adjusted to 20 pmol/µl before use.

PCR was carried out according to Saiki's method in a total volume of 100 µl containing 100 ng of
plasmid pUC2217 (or pUCN22-1, which contains the same region) and 2 µl each of the above 3' and 5'
primers. The reaction mixture was heated at 95 °C for 5 min and quenched at 0 °C. One minute later, 0.5 µl
of Taq DNA polymerase (7 U/µl, AmpliTaq™, Takara Shuzo) was added to the mixture, which was mixed
thoroughly and overlaid with mineral oil. The sample was reacted by repeating 25 cycles of treatments
comprising: 95 °C for 1 min; 55 °C for 1 min; and 72 °C for 1 min, in a DNA Thermal Cycler (Perkin
Elmer Cetus). The resultant reaction solution was extracted with phenol/chloroform and precipitated with
ethanol conventionally. The amplified DNA samples were digested with HindIII and EcoRI and fractionated
by acrylamide gel electrophoresis, and the desired DNA fragment was extracted. The resultant DNA
fragment was then ligated into the HindIII and EcoRI sites of a cloning vector pUC19, cloned and screened
to obtain plasmid pUCNS5N, which was then sequenced. The clone NS5N has a modified base sequence of
that from No. 1246 (C) to 1692 (C) of SEQ ID NO 75, wherein, at the 5' side of said sequence, the following
DNA fragment:



5' GCAAGCTTATG 3' 

3' CGTTCGAATAC 5' (SEQ ID NO 155) 

which comprises a HindIII restriction site followed by an initiation codon ATG, was added, and, at the 3'
side of said sequence, the following DNA fragment:

5' TGATGAAGATCTGAATTCGC 3' 

3' ACTACTTCTAGACTTAAGCG 5' (SEQ ID NO 156) 



which comprises, from 5' to 3', two termination codons and BglII and EcoRI sites, was added.
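
The two added fragments can be read directly off the primers. The sketch below (helper and variable names are hypothetical) checks that primer MSNS5-1 begins with the HindIII site followed by ATG, and that the reverse complement of primer MSNS5-2 ends with the fragment carrying the two termination codons and the BglII and EcoRI sites. The primer sequences are copied from the text; nothing here models annealing or amplification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


MSNS5_1 = "GCAAGCTTATGCAGCGTGGGTACAAGGGGGTT"
MSNS5_2 = "GCGAATTCAGATCTTCATCAGAGCTGTGACCCAACCGTATATTGGTT"

# GC + HindIII (AAGCTT) + ATG
five_prime_tag = "GCAAGCTTATG"
# TGA TGA + BglII (AGATCT) + EcoRI (GAATTC) + GC
three_prime_tag = "TGATGAAGATCTGAATTCGC"

# The 3' primer is written as the antisense strand, so read its reverse complement.
sense_of_3_primer = reverse_complement(MSNS5_2)

print(MSNS5_1.startswith(five_prime_tag))
print(sense_of_3_primer.endswith(three_prime_tag))
print(reverse_complement(three_prime_tag))
```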

[2] Modification of DNA for the Expression of HCV Polypeptide Encoded by MKCNS5 Region in Insect Cells

MKCNS5 region is an ORF derived from HCV gene encoding an amino acid sequence from No. 415 to
No. 1411 of SEQ ID NO 75. For the expression of the polypeptide, an initiation codon ATG is inserted at the
5' side of said gene in frame so that the expression of the gene may be properly effected in insect cells.
The insertion of an ATG initiation codon upstream of the 5' terminus of said gene may be accompanied
by an addition of a foreign polypeptide which is not encoded by HCV gene to the N-terminus (amino
terminus) of all or a part of the amino acid sequence encoded by HCV gene. When an expression vector
containing an initiation codon for insect cells is used, a DNA fragment from the clone is ligated to the
vector such that the frame of said DNA is in conformity with that of the initiation codon on said vector.
This can also be accompanied by an addition of a foreign polypeptide which is not encoded by HCV gene to
the N-terminus (amino terminus) of all or a part of the amino acid sequence encoded by HCV gene. The
polypeptide encoded by MKCNS5 was expressed in insect cells as a single precursor polypeptide, provided
that said polypeptide comprises at least the amino acid sequence from No. 415 to 1411 of SEQ ID NO 75;
the precursor was then processed, for example by glycosylation, and accumulated intracellularly. The
modification of DNA of clone MKCNS5 region was carried out by PCR using the following synthetic DNAs as
primers.
5' primer:

MKCNS5-1: 5' GCGCTAGCATGGGGTACAAGGGGGTTTGGCGGG 3' (SEQ ID NO 185)
3' primer:

MKCNS5-2: 5' GCGCTAGCTCATCGGTTGGGGAGCAGGTAGAT 3' (SEQ ID NO 186)

These primers were designed to introduce an NheI site at both ends of said gene in order to insert said
gene into the NheI site of transfer vector pBlueBac (Invitrogen). Therefore, the use of these primers is
not critical and others can be used which are designed for introducing said gene into any other transfer
vectors for insect cells. The above two synthetic DNAs were adjusted to 20 pmol/µl before use.

The PCR was carried out using the same reaction solution and worked up in the same manner as
described in [1] above, except that primers MKCNS5-1 and MKCNS5-2 and, as a template plasmid, 20
ng of plasmid pUC1530U were used. PCR was accomplished by repeating 10 times a reaction cycle
consisting of 1 min at 95 °C, 1 min at 50 °C and 5 min at 72 °C, and then 20 times a reaction cycle
consisting of 1 min at 95 °C, 1 min at 65 °C and 5 min at 72 °C, to yield the desired 3013 bp DNA fragment.
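
The quoted fragment length can be cross-checked from numbers already given in this section. The sketch below assumes (this breakdown is not stated in the text) that the amplicon consists of the codons for residues No. 415 to 1411, the ATG supplied by primer MKCNS5-1, one termination codon (the reverse complement of primer MKCNS5-2 reads TGA next to the NheI tail), and an 8-base tail, GC plus the NheI site GCTAGC, at each end. The function names are hypothetical.

```python
def inclusive_length(first: int, last: int) -> int:
    """Number of positions in the closed interval [first, last]."""
    return last - first + 1


def expected_amplicon_bp(first_residue: int, last_residue: int,
                         start_codon_bp: int = 3, stop_codon_bp: int = 3,
                         tail_bp_each_end: int = 8) -> int:
    """Coding bases plus added start/stop codons and primer tails (assumed layout)."""
    coding_bp = 3 * inclusive_length(first_residue, last_residue)
    return coding_bp + start_codon_bp + stop_codon_bp + 2 * tail_bp_each_end


residues = inclusive_length(415, 1411)
amplicon = expected_amplicon_bp(415, 1411)
print(residues, amplicon)
```

Under these assumptions the 997 residues give 2991 coding bases and a 3013 bp product, in agreement with the length quoted above. The same helper gives 447 bases (149 codons) for the range No. 1246 to 1692 used for NS5N.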

The DNA fragment was digested with NheI and fractionated by acrylamide gel electrophoresis, and a DNA
fragment of the desired length was extracted. The resultant DNA fragment was then ligated into the NheI
site of a transfer vector pBlueBac (Invitrogen), cloned and screened for a clone which contains a single
DNA fragment inserted at the NheI site to obtain plasmid pBlueMKCNS5.

According to the teaching in the protocol given by Invitrogen, the expression unit of said plasmid
contains the DNA fragment derived from HCV gene oriented forward and ligated to the NheI cloning site
downstream from a polyhedrin promoter.

Example 25 

Expression of HCV Polypeptides Encoded by Clones NS5N and MKCNS4bNS5 in E.coli

Each clone encodes a part of the polypeptide encoded by cDNA originating from serum of an HC patient.
The polypeptide encoded by each clone was expressed in E.coli, as it is, by subcloning said clone into an
expression vector pCZ44 (Japanese Patent Publication (KOKAI) No. 124387/1989).

A DNA fragment having a sequence of clone NS5N obtained in Example 24 was digested thoroughly
with restriction enzymes HindIII and BglII, extracted with phenol/chloroform, precipitated with ethanol,
and separated by acrylamide gel electrophoresis. From the gel was extracted a DNA fragment having
cohesive HindIII- and BglII-restricted ends. The expression vector pCZ44 was digested with HindIII and
BglII. The larger fragment containing a region functional for the expression of DNA was separated, treated
in the same manner, ligated to the HindIII-BglII fragment obtained from the clone and cloned
conventionally. The resultant plasmid was designated as plasmid pCZNS5N after the clone.

Alternatively, expression vectors were constructed using an expression vector pGEX-2T (Pharmacia)
designed to express a fused protein, substantially in accordance with the protocol taught by the
manufacturer (Pharmacia). The expression vector pGEX-2T was digested with BamHI. To the linearized vector
was ligated a HindIII linker to obtain a DNA fragment having EcoRI and HindIII restriction sites at its 3'-
and 5'-termini. Each clone was digested with HindIII and EcoRI to obtain DNA fragments encoding desired
HCV polypeptides. The two fragments were then ligated at their HindIII and EcoRI sites such that the frame
of the codons is in conformity with the amino acid sequence of the clone.

For example, the following region corresponding to HCV polypeptide (hereinafter referred to as
MKCNS4bNS5), having an 863 amino acid sequence from No. 306 to 1168 of SEQ ID NO 75, was expressed
in E.coli. A DNA fragment encoding MKCNS4bNS5 is named clone MKCNS4bNS5.

The above region appears to be an HCAg which can react immunologically with antiserum from HC
patients with high efficiency. This region can be expressed using pCZ44 for the construction of the
expression vector. However, it can also be expressed as a fused polypeptide with GST.

Plasmid pUC2218 (2 µg) was digested thoroughly with restriction enzymes HindIII, PvuII and SspI and
separated by acrylamide gel electrophoresis. From the gel was extracted about 200 ng of a DNA fragment
containing a region from clone 2218, which fragment was then blunt-ended. The DNA fragment 2218
(Hin/Pvu/T4) was inserted into the HindIII site of pGEXHIO, which has a modified sequence of pGEX-2T
wherein the sequence between the BamHI and EcoRI sites of pGEX-2T is changed as follows:

5' GGATCCCCCCAAGCTTGGGGGAATTC 3'
   BamHI     HindIII   EcoRI  (SEQ ID NO 187)



The expression vector pGEXHIO (1 µg) was digested with HindIII completely and blunt-ended. The DNA
fragment from pGEXHIO (20 ng) was ligated to 50 ng of DNA fragment 2218 (Hin/Pvu/T4), transformed, and
cloned conventionally. The resultant plasmid pGEX2218 encodes a fused polypeptide comprising GST
linked to the N22 region of DNA fragment 2218 (Hin/Pvu/T4).

E.coli JM109 strain transformed with plasmid pCZNS5N was grown in L-Broth at 37 °C overnight
(Molecular Cloning, Cold Spring Harbor, 1982). The cultured broth was diluted 50-fold by inoculating it
into freshly prepared L-Broth and the cultivation was continued with shaking at 30 °C for 2 hr. At this
time, IPTG was added to the culture to a final concentration of 2 mM in order to induce the expression of
DNA encoding HCV-originated polypeptide by single-clone-derived transformants (E.coli JM109 cells
transformed solely by said plasmid). The deduced amino acid sequence of cDNA derived from clone NS5N
corresponds to that of No. 1246 to 1692 of SEQ ID NO 75.

In the same manner as above, plasmid pGEX2218 can be used to express a fused protein between
polypeptide MKCNS4bNS5 and GST. The plasmid, as instructed by Pharmacia, contains a sequence
encoding a region specifically cleaved by thrombin at the C-terminal region of GST, followed by a sequence
of clone 2218 (it also contains a short sequence derived from pUC19). The fused protein can be expressed
in the same manner as used for the expression of HCV polypeptide encoded by plasmid pCZNS5N. Thus,
E.coli transformants transformed with plasmid pGEX2218 were grown in the presence of IPTG.

Example 26 

Expression of MKCNS5 Region in Insect Cells

The expression of the HCV-originated protein encoded by plasmid pBlueMKCNS5 prepared in Example 24 
[2] was conducted substantially in accordance with a known expression manual for baculovirus (MAXBAC™ 
BACULOVIRUS EXPRESSION SYSTEM MANUAL VERSION 1.4, hereinafter referred to as Maxbac, 
Invitrogen). 

Plasmid pBlueMKCNS5, prepared in Example 24 [2] by inserting a DNA fragment containing the HCV gene at 
the NheI site of the transfer vector pBlueBac (Maxbac, pp.37), was recovered from E. coli host cells 
transformed thereby, and purified according to the method of Maniatis et al (Molecular Cloning, Cold Spring 
Harbor Laboratory, pp.86 - 96 (1982)). Thus, a large amount of HCV gene-containing transfer plasmid DNA 
was obtained. Sf9 cells were cotransfected with 2 ug of the plasmid containing a DNA fragment from the HCV 
gene and 1 ng of AcNPV viral DNA (Maxbac, pp.27). Sf9 cells were grown in TMN-FH medium (Invitrogen) 
containing 10% FCS (fetal calf serum) in a Petri dish (6 cm diameter) until the cell density reached about 2 
x 10 s /plate. The TMN-FH medium was removed and 0.75 ml of Grace medium (Gibco) containing 10% FCS 
was added thereto. To the DNA mixture described above was added 0.75 ml of transfection buffer 
(attached to the kit), and the whole was thoroughly mixed by vortexing and gradually added dropwise onto the Grace medium. 
After the culture had been allowed to stand for 4 hr at 27 °C, the Grace medium was replaced with 3 ml of TMN-FH 
medium containing 10% FCS and the dish was incubated at 27 °C for 6 days. After three days of incubation, 
a few multinucleate cells were observed, and on the sixth day almost all the cells were multinuclear. The 
supernatant was taken into a centrifuging tube and centrifuged at 1,000 rpm for 10 min to obtain the 
supernatant as a cotransfected viral solution. 

The cotransfected viral solution contained about 10^8 viruses/ml, 0.5% of which were recombinant 
viruses. The isolation of recombinant virus was carried out by a plaque isolation method described below. 

Thus, cells were adsorbed onto Petri dishes (6 cm diameter) by seeding 1.5 x 10^6 cells in medium 
and removing the medium completely. To each dish was added 100 ul of a diluted viral solution (10^-4 and 
10^-5 fold dilutions, separately), and the dish was incubated at room temperature for 1 hr while being slanted every 15 min to 
spread the virus extensively. X-gal medium containing agarose was prepared by adding 5-bromo-4-chloro-3-indolyl-β-D-galactoside 
to a final concentration of 150 ug/l to a warm medium which had been prepared 
by autoclaving 2.5% baculovirus agarose (Invitrogen) at 105 °C for 10 min, mixing it with TMN-FH medium 
containing 10% FCS preheated at 46 °C at a mixing ratio of 1 : 3, and keeping the temperature at 46 °C. 
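
The choice of dilutions can be motivated by a rough estimate of how many plaques each dish should carry. The following is a minimal sketch, not part of the described procedure: it assumes that the two dilutions are 10^-4 and 10^-5, that every virus particle in the inoculum forms one plaque, and it uses a hypothetical helper named `expected_plaques`.

```python
from fractions import Fraction


def expected_plaques(titer_per_ml: int, dilution_exponent: int, volume_ul: int) -> Fraction:
    """Virus particles delivered to one dish.

    titer_per_ml      -- particles per ml of undiluted stock
    dilution_exponent -- n for a 10^-n dilution
    volume_ul         -- inoculum volume in microlitres
    """
    per_ml_after_dilution = Fraction(titer_per_ml, 10 ** dilution_exponent)
    return per_ml_after_dilution * Fraction(volume_ul, 1000)


TITER = 10 ** 8                           # about 10^8 viruses/ml (from the text)
RECOMBINANT_FRACTION = Fraction(5, 1000)  # 0.5% recombinant (from the text)

for n in (4, 5):
    total = expected_plaques(TITER, n, 100)  # 100 ul inoculum per dish
    blue = total * RECOMBINANT_FRACTION
    print(f"10^-{n} dilution: about {float(total):.0f} plaques, "
          f"{float(blue):.1f} of them recombinant")
```

Under these assumptions only a handful of recombinant plaques are expected per dish, which is consistent with the need to repeat the plaque method several times.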

After the completion of infection, the virus solution was aspirated thoroughly from the dish and 4 ml of the 
warm X-gal medium containing agarose (previously prepared) was gently added to every dish so as not to peel 
off the cells. The dish was kept open by slightly sliding the lid until the agarose solidified and dried, and thereafter the 
dish was covered, turned upside down, and incubated at 27 °C for 6 days. The plaques were observed under a 
phase-contrast microscope to find blue plaques which do not form multinucleate cells. Agarose containing 
blue recombinant plaques was removed with an aspirating pipet and suspended in 1 ml of TMN-FH 
medium by pipetting many times. The above process, which comprises infection, 6-day incubation, and 
isolation of virus containing transfer plasmid DNA, is called the "plaque method". The plaque method was 
repeated using 100 ul of viral suspension. After repeating said process three times, there was obtained a 
recombinant virus having a gene encoding HCV glycoprotein, free from contamination with the wild-type 
strain. 

A viral solution of the primary recombinant virus was prepared by aspirating plaques with a Pasteur 
pipet, and mixing thoroughly with 1 ml of TMN-FH medium. Because the primary viral solution was low in 
virus density for infection, it required further treatment for concentration. Thus, 100 ul of the viral solution was 
adsorbed onto Sf9 cells grown in a Petri dish (6 cm in diameter) to semi-confluence, and 4 ml of TMN-FH 
medium was added thereto and incubated for three days. The culture supernatant was recovered to yield a 
recombinant viral solution for infection. 

For the production of HCV structural protein, a suspension of Sf9 cells in TMN-FH medium containing 
10% FCS (5 x 10^6 cells/10 ml medium) was added to a Petri dish (9 cm in diameter) and kept for 1 hr for 
adsorption. After the removal of the medium, 250 ul of recombinant viral solution was added to the dish and 
spread extensively. To the dish was added 10 ml of TMN-FH medium containing 10% FCS, and the dish was incubated at 
27 °C for 4 days. The cells expressing recombinant glycoprotein of HCV were harvested by scraping 
and suspended in 1,000 ml of phosphate buffered saline. 

Thus, HCV-derived glycoprotein was expressed in Sf9 cells transfected with said virus. 



Example 27 



Identification of Expression Products as HCAg 

Each expression product obtained in Examples 25 and 26 was identified as HCAg because it reacted 
immunologically with antiserum obtained from HC patients. Identification was conducted by Western blot 
technique. 

E. coli cells transformed with expression plasmid pCZNS5N or pGEX2218 were grown in the presence 
of IPTG for 3 hr or overnight in the same manner as described in Example 25. 

Recombinant strains were harvested by centrifuging 1,000 ul of the cultured broth at 6,500 rpm for 10 min. 
The pellet was dissolved in a sample solution (50 mM Tris-HCl, pH 6.8, containing 2% SDS, 5% 
mercaptoethanol, 10% glycerin, and 0.005% bromophenol blue) for SDS-polyacrylamide gel electrophoresis 
to a final volume of 0.2 ml. Sf9 cells infected with viruses which had been treated more than 3 times by the 
plaque method were collected by scraping and suspended in 1,000 ml of PBS, and 100 ul of the 
suspension was centrifuged at 6,500 rpm for 10 min to pellet the cells. The pellet was dissolved in a sample 
solution for SDS-polyacrylamide gel electrophoresis to a final volume of 0.2 ml. 

The sample solutions were then boiled at 100 °C for 10 min. Ten ul of the boiled solution was loaded 
onto a 0.1% SDS-15% polyacrylamide gel (70 x 85 x 1 mm) together with a marker protein LMW Kit E (low-molecular-weight 
marker protein, Pharmacia). Electrophoresis was carried out at a constant current of 30 
mA for 45 min in Tris buffer (25 mM Tris, pH 8.3, 192 mM glycine, 0.1% SDS) as electrode buffer. 
Thereafter, the protein was transferred electrophoretically to a nitrocellulose filter by superposing the gel onto a 
filter BA-83 (S & S) and applying a constant current of 120 mA for about 20 min between the gel (cathode) and 
the filter (anode) in a conventional manner. 
The transferred filter was cut into a part containing the marker protein (referred to as the marker filter) 
and a part containing the sample (referred to as the sample filter). The former was stained with 0.1% (w/v) 
amido black 10B and the latter was immersed in 0.01 M PBS (pH 7.4) containing 5% (w/v) bovine serum 
albumin (BSA). Serum from an HC patient was diluted 50 times with 0.01 M PBS (pH 7.4) containing 5% 
(w/v) BSA. To the sample filter was added 10 ul of diluted serum and the filter was allowed to stand for 2 hr at 
room temperature. Thereafter, the filter was washed with PBS containing 0.1% (w/v) Tween 20 for 20 min 
(x3). 

The sample filter was then reacted with 10 ml of horseradish peroxidase conjugated anti-human IgG 
(Cappel) at 37 °C for 1 hr and washed with PBS containing 0.1% (w/v) Tween 20 for 20 min (x3). The filter 
was then immersed in a peroxidase color-producing solution (60 mg 4-chloro-1-naphthol, 20 ml methanol, 
80 ml PBS, and 20 ul aqueous hydrogen peroxide). The colored filter was washed with distilled water and 
compared with the marker filter, demonstrating that the colored protein expressed by transformants carrying 
plasmid pCZNS5N or pGEX2218 had a reasonable molecular weight for an expression product of the 
inserted HCV gene and was identified as HCAg. The expression product from transformants carrying 
pGEX2218 was a fused protein consisting of the HCV-originated polypeptide, GST and a thrombin 
cleavage site, wherein the latter two are attached at the N-terminus of the former. 

Example 28 



Preparation of Clone T7N1-25 

[1] Preparation of Clone 1925 

Clones N19MX24A-1 (prepared in Example 11[1]) and MX25-1 were ligated by PCR using their overlapping region. 
One ul (about 0.5 to 1 ug/ul) of each DNA fragment from clones N19MX24A-1 and MX25-1 (977 and 
849 bp, respectively) was added to a reaction mixture containing 10 ul of 10 x PCR buffer (100 mM Tris-HCl 
(pH 8.3), 500 mM KCl, 15 mM MgCl2, 1% gelatin), 8 ul of 2.5 mM 4 dNTPs, 5 ul each of 20 pmol/ul 
synthetic primers MS122 and MS152, and 76.5 ul of water. After thorough mixing, the mixture was heated 
at 95 °C for 5 min, then immediately cooled to 0 °C. One minute later, 0.5 ul of 
Taq DNA polymerase (7 U/ul, AmpliTaq™ Takara Shuzo) was added to the mixture, which was mixed and overlaid with mineral oil. The sample 
was then subjected to PCR, which was conducted by repeating 10 times a reaction cycle comprising 
the following treatments: at 95 °C for 1 min; at 37 °C for 1 min; and at 72 °C for 2 min in a DNA Thermal 
Cycler (Perkin Elmer Cetus). This was followed by an incubation at 97 °C for 2 min. The mixture was 
immediately cooled to 0 °C, kept at the same temperature for 2 min, mixed with 0.5 ul of Taq DNA 
polymerase (7 U/ul, AmpliTaq™ Takara Shuzo), and overlaid with mineral oil. The sample was then treated 
in the same manner as above by repeating 15 times a reaction cycle comprising the following 
treatments: at 95 °C for 1 min; at 55 °C for 1 min; and at 72 °C for 2 min. After the final treatment at 72 °C 
for 7 min, the resultant reaction solution was treated with phenol/chloroform and then precipitated with ethanol. 
The two DNA fragments were thus ligated and amplified by PCR. The ligated DNA sample was fractionated by 
agarose gel electrophoresis and a gel slice containing a fragment of about 1000 bp was excised 
(Molecular Cloning (1982) Cold Spring Harbor). The resultant DNA fragment was then modified as 
described in Example 3 and ligated into the SmaI site of the multi-cloning sites of pUC19, cloned and screened as 
described in Example 3 to obtain plasmid pUC1925. The resultant clone derived from serum of an HC patient 
was designated as clone 1925. 
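
The two-stage cycling programme used in [1] can be written down as data to make the cycle counts and hold times explicit. This is only a bookkeeping sketch: the class names `Step` and `Stage` are hypothetical, and the initial 95 °C denaturation, the 0 °C holds and the ramp times are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    minutes: int   # hold time at that temperature


@dataclass(frozen=True)
class Stage:
    steps: tuple   # one or more Step objects run in order
    cycles: int = 1

    def hold_minutes(self) -> int:
        """Total programmed hold time of this stage."""
        return self.cycles * sum(step.minutes for step in self.steps)


# Thermal-cycler stages as described in the text for clone 1925.
PROGRAM = (
    Stage((Step(95, 1), Step(37, 1), Step(72, 2)), cycles=10),  # low-stringency cycles
    Stage((Step(97, 2),)),                                      # denaturation before fresh enzyme
    Stage((Step(95, 1), Step(55, 1), Step(72, 2)), cycles=15),  # higher-stringency cycles
    Stage((Step(72, 7),)),                                      # final extension
)

total_cycles = sum(stage.cycles for stage in PROGRAM if stage.cycles > 1)
total_minutes = sum(stage.hold_minutes() for stage in PROGRAM)
print(f"{total_cycles} cycles, {total_minutes} min of programmed holds")
```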


[2] Preparation of Clone T7N119 

Plasmid pUCN1-1 contains cDNA clone N1-1 at the SmaI site of pUC19 in the order, from 5' to 3', HindIII site of pUC19 
and HCV gene. The plasmid pUCN1-1 was digested completely with HindIII and NcoI and the larger 
fragment pUCNIHN containing the vector function was isolated. Ten ng of said DNA fragment was ligated 
to the following synthetic DNAs: 

MS168: AGCTTACTAGTTAATACGACTCACTATAGGG (31 base pairs, SEQ ID NO: 188) 
MS169: CTGGCACCCTATAGTGAGTCGTATTAACTAGTA (33 base pairs, SEQ ID NO: 189) 
MS170: TGCCAGCCCCCTGATGGGGGCGACACTCCACCATAGATCACTCC (44 base pairs, SEQ ID NO: 190) 
MS171: TCACAGGGGAGTGATCTATGGTGGAGTGTCGCCCCCATCAGGGGG (45 base pairs, SEQ ID NO: 191) 
MS172: CCTGTGAGGAACTACTGTCTTCACGCAGAAAGCGTCTAGC (40 base pairs, SEQ ID NO: 192) 
MS173: CATGGCTAGACGCTTTCTGCGTGAAGACAGTAGTTCC (37 base pairs, SEQ ID NO: 193) 

The above DNA fragments are shown from 5' to 3' termini. 
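
How the six synthetic DNAs fit together can be checked by pairing them computationally. The sketch below is an illustration under stated assumptions: the strand order (MS168-MS170-MS172 on top, MS173-MS171-MS169 on the bottom) is inferred here from complementarity and is not stated in the text, and `reverse_complement` is a hypothetical helper.

```python
OLIGOS = {
    "MS168": "AGCTTACTAGTTAATACGACTCACTATAGGG",
    "MS169": "CTGGCACCCTATAGTGAGTCGTATTAACTAGTA",
    "MS170": "TGCCAGCCCCCTGATGGGGGCGACACTCCACCATAGATCACTCC",
    "MS171": "TCACAGGGGAGTGATCTATGGTGGAGTGTCGCCCCCATCAGGGGG",
    "MS172": "CCTGTGAGGAACTACTGTCTTCACGCAGAAAGCGTCTAGC",
    "MS173": "CATGGCTAGACGCTTTCTGCGTGAAGACAGTAGTTCC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


lengths = {name: len(seq) for name, seq in OLIGOS.items()}

# Assumed strand order, both strands written 5' to 3'.
top = OLIGOS["MS168"] + OLIGOS["MS170"] + OLIGOS["MS172"]
bottom = OLIGOS["MS173"] + OLIGOS["MS171"] + OLIGOS["MS169"]

# Each strand keeps a 4-base 5' overhang; the remainder should pair exactly.
duplex_ok = top[4:] == reverse_complement(bottom)[:-4]

print(lengths)
print("5' overhang of top strand:   ", top[:4])     # compatible with a HindIII end
print("5' overhang of bottom strand:", bottom[:4])  # compatible with an NcoI end
print("strands pair without gaps:", duplex_ok)
```

With this order the junctions between oligos are staggered on the two strands, which would explain why only the outermost oligos, MS168 and MS173, are left without a 5' phosphate.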

The DNA fragments except MS168 and MS173 were kinased at the 5' terminus. 100 pmol of each of the 
5'-kinased MS169, MS170, MS171 and MS172, and 20 pmol of each of MS168 and MS173 were ligated in the 
presence of T4 DNA ligase, and the reaction mixture was subjected to phenol treatment and ethanol 
precipitation, conventionally. A quarter of the precipitated DNA sample was ligated to 10 ng of pUCNIHN to obtain 
plasmid pUCT7N1 which comprises, from 5' to 3', HindIII site, SpeI site, promoter sequence for 
T7 RNA polymerase, 5' non-translational region of the HCV gene, and a DNA fragment of a gene encoding the 
N-terminal region of HCV core protein, at the 5' side of clone N1-1. The resultant plasmid pUCT7N1 contains 
clone T7N1 between the HindIII and SmaI sites. Clone T7N1N3N10 was prepared in the same manner as that 
described in Example 4 [2] except that plasmid pUCT7N1 was used instead of pUCN1-1 having clone N1-1. 

Clones T7N1N3N10 and N27N19-1 prepared in Example 11 [2] were ligated by taking advantage of a 
unique restriction site which exists in the overlapping regions of both clones. Upon digestion with 
restriction enzyme BamHI, T7N1N3N10 and N27N19-1 were cleaved at the 3' site of No. 1332 (G) and No. 3 
(G), respectively. The ligation was accomplished on the basis of an assumption that plasmids pUCN1N3N10 
and pUCN27N19-1 contain each DNA fragment in the same orientation (on the HCV gene, the HindIII site of 
pUC19 is located at the 5' side). Thus, plasmid pUCN119 was prepared by digesting pUCN27N19-1 with EcoRI 
and BamHI to isolate a DNA fragment containing the 5' region of clone N27N19-1 (the fragment comprises 
clone N27N19-1 attached at the 3' terminus by the EcoRI-SmaI fragment of plasmid pUC19), ligating said 
fragment to the EcoRI-BamHI fragment containing the vector function of plasmid pUCN1N3N10, cloning and 
screening. Plasmid pUCN119 contains the desired clone T7N119 comprising, from 5' to 3', HindIII site, SpeI 
site, promoter sequence for T7 RNA polymerase, a part of the 5' non-translational region of the HCV 
gene, and clones N1-1, N3-1, N10-1, N27-3, N19-1 without overlapping. 

[3] Preparation of Clone T7N1-25 

Clones T7N119 and 1925 prepared in the above [1] were ligated by taking advantage of a unique 
restriction site which exists in the overlapping regions of both clones. Upon digestion with restriction 
enzyme PvuI, clone T7N119 was cleaved at the 3' site of No. 288 (T) of the base sequence of clone N19-1 in the N19 
region which is shown by SEQ ID NO 16, and clone 1925 was cleaved at 3' of No. 288 (T). The ligation of 
clones T7N119 and 1925 was accomplished on the basis of an assumption that plasmids pUCT7N119 and 
pUC1925 contain each DNA fragment in the same orientation. Thus, plasmid pUCTN119 was prepared by 
digesting pUC1925 with PvuII and EcoRI to isolate a DNA fragment encoding the HCV-originated gene (said 
DNA fragment contains at 3' of said cDNA an EcoRI-SmaI fragment of plasmid pUC19), exchanging the 
PvuII-EcoRI fragment containing the 3' region of the N19 region of plasmid pUCT7N119 with the fragment obtained 
from plasmid pUCTN1-25, cloning, and screening. 

Plasmid pUCT7N1-25 contains the desired clone T7N1-25 comprising clones T7N119 and 1925 ligated 
at the PvuI site without overlapping. 

Example 29 

Preparation of Clone T7N1-30U 

[1] Preparation of Clone 1530UNot 


The clone 1530U prepared in Example 23[5] contains a HindIII site adjacent to the 3' side of the cDNA of HCV. 
Plasmid pUC1530U was digested completely with HindIII and blunt-ended with T4 DNA polymerase conventionally. 
Ten ng of the resultant DNA fragment was ligated to an excess amount of EcoRI-NotI-BamHI adapter (x 
100 molar, Toyobo) in the presence of T4 DNA ligase, conventionally. After the phenol treatment and 
ethanol precipitation, the fragment was digested with NotI, ligated, cloned and screened to yield plasmid 
pUC1530UNot. 

[2] Preparation of Clone T7N1-30U 

Clones T7N1-25 prepared in Example 28, MX25N15-1 prepared in Example 17 [4], and 1530UNot 
obtained in the present Example were ligated by taking advantage of unique restriction sites which exist in 
the overlapping regions of the clones. pUCT7N1-25 was digested with SpeI and PstI and about 1 ng of a 
DNA fragment T7N1-25SP containing the majority of clone T7N1-25 was extracted from gel. Plasmid 
pUCMX25N15-1 was digested with PstI and EcoT22I and about 1 ng of a DNA fragment MX25N15-1PE 
containing the majority of clone MX25N15-1 was extracted from gel. Plasmid pUC1530UNot was digested 
with EcoT22I and NotI and about 1 ng of a DNA fragment 1530UEN containing the majority of clone 
1530UNot was isolated from gel. 

About 200 ng of each of the above fragments T7N1-25SP, MX25N15-1PE and 1530UEN, and 1 ng of the 
SpeI-NotI fragment of λZapII (Stratagene) were ligated according to the protocol attached to the kit. This was 
followed by packaging with GIGAPACK II PACKAGING EXTRACTS, GOLD (Stratagene). All the procedures 
including ligation, titer check, amplification of λDNA, isolation and packaging were conducted according to 
the protocol attached thereto. The recombinant phage was screened for the 
inserted DNA fragment by isolating 20 white plaques and subcloning into plasmid pBLUESCRIPT SK (-). 
Two of the 20 clones subcloned into plasmid pBLUESCRIPT SK (-) contained a DNA fragment 
having the three sequences of the HCV gene between the SpeI and NotI sites of said λZapII (from the 5' SpeI 
site to 3': clones T7N1-25SP, MX25N15-1PE and 1530UEN). The resultant plasmid was designated as 
pT7N1-30U. 

The plasmid pT7N1-30U contains a clone T7N1-30U comprising three DNA fragments originated from 
HCV ligated without overlapping between the SpeI and NotI sites. The base sequence of said clone and the amino acid sequence of the polypeptide 
encoded thereby are shown in SEQ ID NO 101. 

Example 30 


Large-Scale-Expression of Polypeptides CORE and C + N23 
[1] Preparation of clone CN23 



A region of clone N23-1 to be expressed was obtained by PCR using, as a template, pUCN23-1 having 
clone N23-1 prepared in Example 16. The following synthetic DNAs were used as primers. 
5' primer: 

MS165: 5' GCAAGCTTATGCTGCTGTCGCCCGGGCCCATCT 3' (SEQ ID NO: 194) 
3' primer: 

MS166: 5' GCGAATTCAGATCTTCATCATGTGTTGCAGTCGATCAC 3' (SEQ ID NO: 195) 
The synthetic DNAs were adjusted to 20 pmol/ul before use. 

PCR was carried out in the same manner as described above according to Saiki's method in a 
total volume of 100 ul containing 100 ng of plasmid pUCN23-1 as a template and 2 ul each of the 3' and 5' 
primers. The reaction mixture was heated at 95 °C for 5 min and quenched at 0 °C. One minute later, 0.5 ul of 
Taq DNA polymerase (7 U/ul, AmpliTaq™ Takara Shuzo) was added to the mixture, which was mixed thoroughly 
and overlaid with mineral oil. The sample was reacted by repeating 8 cycles of treatments comprising: 
95 °C for 1 min; 55 °C for 1 min; and 72 °C for 1 min in a DNA Thermal Cycler (Perkin Elmer 
Cetus). This was followed by 17 reaction cycles comprising: 95 °C for 1 min; 65 °C for 1 min; 
and 72 °C for 1 min. The resultant reaction solution was extracted with phenol/chloroform and 
precipitated with ethanol conventionally. The amplified DNA samples were digested with HindIII and EcoRI, 
fractionated by acrylamide gel electrophoresis and extracted. 

The DNA fragment was then ligated into the HindIII and EcoRI sites of cloning vector pUC19, cloned and 
screened to obtain plasmid pUCN23A. The base sequence of clone N23A shows that it comprises a DNA 
fragment shown by the base sequence from Nos. 1 to 915 of SEQ ID NO 50 having additional DNA fragments 
attached to both the 5'- and 3'-termini. That is, at its 5'-terminus, the following DNA fragment comprising a 
HindIII restriction site followed by an initiation codon ATG was attached. 



5' GCAAGCTTATG 3' 
3' CGTTCGAATAC 5' (SEQ ID NO 155) 

And at its 3'-terminus, the following DNA fragment comprising two termination codons and BglII and EcoRI sites, 
from 5' to 3', was attached. 

5' TGATGAAGATCTGAATTCGC 3' 
3' ACTACTTCTAGACTTAAGCG 5' (SEQ ID NO 156) 
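
The relation between the two attached fragments and the PCR primers MS165 and MS166 can be verified with a few lines of code. This is a minimal sketch; the helper names `complement` and `reverse_complement` and the constant names are hypothetical, and the sequences are copied from this Example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, keeping the written order (gives the 3'->5' strand)."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complementary strand written 5' to 3'."""
    return complement(seq)[::-1]


HEAD_TOP = "GCAAGCTTATG"            # 5' addition: HindIII site + ATG (SEQ ID NO 155)
TAIL_TOP = "TGATGAAGATCTGAATTCGC"   # 3' addition: two stops, BglII, EcoRI (SEQ ID NO 156)
MS165 = "GCAAGCTTATGCTGCTGTCGCCCGGGCCCATCT"
MS166 = "GCGAATTCAGATCTTCATCATGTGTTGCAGTCGATCAC"

# Lower strands as printed under each fragment (written 3' to 5').
print(complement(HEAD_TOP))
print(complement(TAIL_TOP))

# The 5' primer carries the 5' addition directly; the 3' primer carries the
# reverse complement of the 3' addition.
print(MS165.startswith(HEAD_TOP))
print(MS166.startswith(reverse_complement(TAIL_TOP)))
```

The check shows that both additions come from the primers themselves, so no separate adapter ligation is needed to introduce the start codon, the stop codons or the cloning sites.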

Plasmid pCZCORE obtained in Example 6 [1] was digested with SacII and blunt-ended with T4 DNA 
polymerase conventionally, then digested with BglII and subjected to acrylamide gel 
electrophoresis. From the gel, a DNA fragment pCZCORE/SB containing the vector part of vector pCZ and 
the N-terminal region of the core protein of HCV was extracted. 

In the same manner, plasmid pUCN23A was digested completely with SmaI and BglII and subjected to 
acrylamide gel electrophoresis. From the gel, a DNA fragment N23A/SB containing the sequence of clone 
N23-1 was extracted, which fragment contains, from the 5' terminus, the base sequence from No. 107 (G) to No. 
915 (A) of SEQ ID NO 50, two stop codons and a BglII site. 

Ten ng of the DNA fragment pCZCORE/SB and 100 ng of N23A/SB were ligated conventionally to obtain 
plasmid pCZCN23, which contains clone CN23 of SEQ ID NO 102 between the HindIII and BglII sites. 

[2] Modification of expression vector 

The efficiency of expression was improved by placing multiple copies of the expression unit in the 
expression vector. Thus, plasmids pCZCORE and pCZCN23 were digested thoroughly with 
restriction enzymes BamHI and BglII, and the resultant DNA fragments CORE/BB and CN23/BB, each encoding a 
polypeptide derived from HCV, were recovered. 

The DNA fragment CORE/BB (100 ng) was ligated by T4 DNA ligase at 12 °C for 30 minutes according 
to a conventional method. The resultant material was worked up with phenol treatment and ethanol 
precipitation, digested with restriction enzymes BamHI and BglII, digested thoroughly with BglII, and ligated 
to plasmid pCZCORE (10 ng) previously dephosphorylated with alkaline phosphatase by a conventional 
method to obtain plasmid pCZCORE tandem 2, 3, 4, 8, 16, in which 2, 3, 4, 6, 12 expression units of 
polypeptide CORE between the BamHI and BglII sites of plasmid pCZCORE are ligated forwardly in tandem. 
The same procedure was conducted with the DNA fragment CN23/BB and plasmid pCZCN23 to obtain 
plasmid pCZCN23 tandem 2, 3, 4, 6. 

[3] Direct Expression of polypeptides CORE and CN23 in Large Scale 

Expression of polypeptides CORE and CN23 in E. coli was conducted using each of the expression vectors 
obtained in the above [2] by the same method as in Example 6 [1]. 

For this purpose, conditions such as the timing for induction, species or strains of host cells, number of 
tandem and the temperature of the culture in the system transformed with pCZCORE were studied. 

For example, hosts derived from the K12 strain, such as JM109, DH5 and KS476, and hosts derived from the B 
strain were studied. The degree of expression varied depending on the host. The host derived from the B strain 
and KS476 gave excellent expression, and the expression amount per culture medium was about 8 to 10 
times larger than that obtained using DH5 as host cells. The quantities also varied depending on the time 
of induction. Thus, 0.5 ml of overnight culture containing transformants (OD 600 = about 1.5) was 
inoculated into 10 ml of bactopeptone medium (Difco; 20 g/l bactopeptone, 0.2% v/v glycerin, 0.1 M MgSO4, 10 
g/l NaCl, 160 ul/l of 0.1% thiamin chloride, 100 mg/l ampicillin) in a 10 ml L-shaped tube and cultured at 
30 °C. IPTG was added for induction when the optical density (OD 600) reached about 0.5, 0.8, 1.2, 2.0 
or 3.0. The culture which was induced when the OD 600 reached about 0.5 gave 
the best expression, with the highest amount of expression product. The expression was not 
directly proportional to the number of tandem units. For example, in cells transformed with an expression 
plasmid containing three CORE/BB expression units in tandem, the expression efficiency was low, 
whereas it increased drastically when the plasmid contained 4 units in tandem and kept increasing until 
the number of units reached 8. However, no further significant improvement was observed, and the 
expression amount was almost the same between cultures containing host cells transformed with tandem 8 
and 16. The above studies provided the conditions for large-scale expression of polypeptide CORE as 
follows. A host cell derived from the B strain or KS476 was transformed with pCZCORE tandem 8 and cultured at 
30 °C overnight, with expression induced when the density reached about 0.5 (OD 600). Among the 
plasmids pCZCN23 tandem 2, 3, 4, 6 prepared for the expression of polypeptide CN23, tandem 6 was used 
and the expression was carried out under the same conditions as those used for pCZCORE tandem. 






EP 0 518 313 A2 



SEQ ID NO: 1 

SEQUENCE LENGTH: 483 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 

MOLECULE TYPE: cDNA to genomic RNA 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N1-1 

CTCCACCATA GATCACTCCC CTGTGAGGAA CTACTGTCTT CACGCAGAAA GCGTCTAGCC 60 
ATGGCGTTAG TATGAGTGTC GTGCAGCCTC CAGGACCCCC CCTCCCGGGA GAGCCATAGT 120 
GGTCTGCGGA ACCGGTGAGT ACACCGGAAT TGCCAGGACG ACCGGGTCCT TTCTTGGATC 180 
AACCCGCTCA ATGCCTGGAG ATTTGGGCGT GCCCCCGCGA GACTGCTAGC CGAGTAGTGT 240 
TGGGTCGCGA AAGGCCTTGT GGTACTGCCT GATAGGGTGC TTGCGAGTGC CCCGGGAGGT 300 
CTCGTAGACC GTGCATC ATG AGC ACA AAT CCA AAA CCC CAA AGA AAA ACC 350 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr 
1 5 10 

AAA CGT AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT 398 
Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly 
15 20 25 

GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG 446 
Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg 
30 35 40 

TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG T 483 
Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg 
45 50 55 
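
The amino acid line of a listing such as SEQ ID NO: 1 can be cross-checked against its nucleotide line by translating the coding part with the standard genetic code. The sketch below is illustrative: the function name `translate` is hypothetical, and only the coding part of SEQ ID NO: 1 (from the ATG onward) is reproduced.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Coding part of SEQ ID NO: 1, copied codon by codon from the listing.
CODING = (
    "ATG AGC ACA AAT CCA AAA CCC CAA AGA AAA ACC "
    "AAA CGT AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT "
    "GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG "
    "TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG T"
).replace(" ", "")


def translate(seq: str) -> str:
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


protein = translate(CODING)
print(len(CODING), "nt ->", len(protein), "residues")
print(protein)
```

The same check, applied to the other listings, flags positions where a printed codon and the residue beneath it disagree.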

SEQ ID NO: 2 

SEQUENCE LENGTH: 187 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No 

MOLECULE TYPE: cDNA to genomic RNA 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N2-1 



AGGTCTCGTA GACCGTGCAT C ATG AGC ACA AAT CCA AAA CCC CAA AGA AAA 51 
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys 
1 5 10 

ACC AAA CGT AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 99 
Thr Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly 
15 20 25 

GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC 147 
Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro 
30 35 40 

AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG T 187 
Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg 
45 50 55 



SEQ ID NO: 3 

SEQUENCE LENGTH: 531 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 

MOLECULE TYPE: cDNA to genomic RNA 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: N3-1 



AGGTCTCGTA GACCGTGCAT C ATG AGC ACA AAT CCA AAA CCC CAA AGA AAA ACC 54 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr 
1 5 10 

AAA CGT AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT 102 
Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly 
15 20 25 

GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG 150 
Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg 
30 35 40 

TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG TCG CAA CCT CGT 198 
Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg 
45 50 55 

GGA AGG CGA CAA CCT ATC CCC AAG GCT CGC CAA CCC GAG GGC AGG GCC 246 
Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln Pro Glu Gly Arg Ala 
60 65 70 75 

TGG GCT CAG CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG 294 
Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu 
80 85 90 

GGG TGG GCA GGA TGG CTC CTG TCA CCC CGC GGC TCC CGG CCT AGT TGG 342 
Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp 
95 100 105 

GGC CCC ACG GAC CCC CGG CGT AGG TCG CGT AAT TTG GGT AAG GTC ATC 390 
Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile 
110 115 120 

GAT ACC CTC ACA TGC GGC TTC GCC GAT CTC ATG GGT ACA TTC CGC TCG 438 
Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu 
125 130 135 

GTC GGC GCC CCC CTA GGG GGC GCT GCC AGG GCT CTA GCG CAT GGC GTC 486 
Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val 
140 145 150 155 

CGG GTT CTG GAG GAC GGC GTG AAC TAC GCA ACA GGG AAC TTG CCC 531 
Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro 
160 165 170 



SEQ ID NO: 4 
SEQUENCE LENGTH: 755 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA to genomic RNA 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N10-1 



C GTG AAC TAT GCA ACA GGG AAT CTG CCT GGT TGC TCC TTT TCT ATC TTC 49 
Val Asn Tyr Ala Tyr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe 
1 5 10 15 

CTT TTG GCT TTG CTG TCC TGT TTG ACC ATC CCA GCT TCC GCC TAC CAA 97 
Leu Leu Ala Leu Leu Ser Cys Leu Thr Ile Pro Ala Ser Ala Tyr Gln 
20 25 30 

GTG CGC AAC GCG TCC GGG GTG TAC CAT GTC ACG AAC GAC TGC TCC AAC 145 
Val Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp Cys Ser Asn 
35 40 45 

TCA AGT ATT GTG TAT GAG GCG GCG GAC GTG ATT ATG CAC ACC CCC GGG 193 
Ser Ser Ile Val Thr Glu Ala Ala Asp Val Ile Met His Thr Pro Gly 
50 55 60 

TGC GTG CCC TGC GTC CGG GAG AAC AAT TCC TCC CGC TGC TGG GTA GCG 241 
Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala 
65 70 75 80 

CTC ACT CCC ACG CTT GCG GCC AGG AAC AGC AGC ATC CCC ACT ACG ACA 289 
Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser Ser Ile Pro Thr Thr Thr 
85 90 95 

ATA CGG CGT CAT GTC GAC TTG CTC GTT GGG GCA GCT GTC CTC TGT TCC 337 
Ile Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Ala Leu Cys Ser 
100 105 110 

GCT ATG TAT GTG GGG GAT TTT TGC GGA TCT GTT TTC CTC GTC TCC CAG 385 
Ala Met Tyr Val Gly Asp Phe Cys Gly Ser Val Phe Leu Val Ser Gln 
115 120 125 

CTG TTC ACT TTC TCA CCT CGC CGG TAT GAG ACG GTG CAA GAC TGC AAT 433 
Leu Phe Thr Phe Ser Pro Arg Arg Tyr Glu Thr Val Gln Asp Cys Asn 
130 135 140 

TGC TCA ATC TAT CCC GGC CAT GTA TCA GGC CAT CGC ATG GCT TGG GAT 481 
Cys Ser Ile Tyr Pro Gly His Val Ser Gly His Arg Met Ala Trp Asp 
145 150 155 160 

ATG ATA ATG AAT TGG TCA CCT ACA ACA GCC CTA GTG GTA TCG CAG CTA 529 
Met Ile Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu 
165 170 175 

CTC CGG ATC CCA CAA GCC GTC GTG GAT ATG GTG GCG GGG GCC CAC TGG 577 
Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp 
180 185 190 

GGA GTC CTG GCG GGC CTT GCC TAC TAT TCC ATG GTG GGG AAC TGG GCT 625 
Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala 
195 200 205 

AAG GTC TTG GTT GTG ATG CTG CTC TTC GCC GGT GTT GAC GGG GGG ACC 673 
Lys Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly Gly Thr 
210 215 220 

CAC GTG ACA GGG GGA AAG GTA GCC TAC ACC ACC CAG AGC TTT ACA TCC 721 
His Val Thr Gly Gly Lys Val Ala Tyr Thr Thr Gln Ser Phe Thr Ser 
225 230 235 240 

TTC TTT TCA CGA GGG CCG TCT CAG AGA ATC CAG C 755 
Phe Phe Ser Arg Gly Pro Ser Gln Arg Ile Gln 
245 250 

SEQ ID NO: 5 

SEQUENCE LENGTH: 1258 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 

MOLECULE TYPE: cDNA to genomic RNA 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: N3N10 



AGGTCTCGTA GACCGTGCAT C ATG AGC ACA AAT CCA AAA CCC CAA AGA AAA ACC 54 
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr 
1 5 10 

AAA CGT AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT 102 
Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly 
15 20 25 

GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG 150 
Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg 
30 35 40 

TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG TCG CAA CCT CGT 198 
Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg 
45 50 55 

GGA AGG CGA CAA CCT ATC CCC AAG GCT CGC CAA CCC GAG GGC AGG GCC 246 
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35 



40 



Gly Arg Arg 
60 

TGG GCT CAG 
Trp Ala Gin 

GGG TGG GCA 
Gly Trp Ala 

GGC CCC ACG 
Gly Pro Thr 
110 

GAT ACC CTC 
Asp Thr Leu 

125 
GTC GGC GCC 
Val Gly Ala 
140 

CGG GTT CTG 
Arg Val Leu 



Gin Pro lie Pro 
€5 
TAG CCT 
Tyr Pro 



CCC GGG 
Pro Gly 
80 

GGA TGG 
Gly Trp 

95 
GAC CCC 
Asp Pro 

ACA TGC 
Thr Cys 

CCC CTA 
Pro Leu 



45 



TGC TCC TTT 
Cys Ser Phe 

CCA GCT TCC 
Pro Ala Ser 
190 

ACG AAC GAC 
Thr Asn Asp 

205 
ATT ATG CAC 
lie Met His 
220 

TCC CGC TGC 
Ser Arg Cys 

AGC ATC CCC 
Ser lie Pro 



GAG GAC 
Glu Asp 
160 
TCT ATC 
Ser lie 
175 

GCC TAC 
Ala Tyr 

TGC TCC 
Cys Ser 

ACC CCC 
Tyr Pro 

TGG GTA 
Trp Val 
240 
ACT ACG 
Thr Thr 



CTC CTG 
Leu Leu 

CGG CGT 
Arg Arg 

GGC TTC 
Gly Phe 
130 
GGG GGC 
Gly Gly 
145 

GGC GTG 
Gly Val 

TTC CTT 
Phe Leu 

CAA GTG 
Gin Val 

AAC TCA 
Asn Ser 
210 
GGG TGC 
Gly Cys 
225 

GCG CTC 
Ala lieu 



Lys Ala Arg Gin Pro Glu 

70 

CTC TAT 
Leu Tyr 

85 
CGC GGC 
Arg Gly 



TGG CCC 
Trp Pro 



GGC AAT 
Gly Asn 



TCA CCC 
Ser Pro 
100 
AGG TCG 
Arg Ser 
115 

GCC GAT 
Ala Asp 

GCT GCC 
Ala Ala 

AAC TAT 
Asn Tyr 

TTG GCT 
Leu Ala 
180 
CGC AAC 
Arg Asn 
195 

AGT ATT 
Ser lie 



TCC CGG 
Ser Arg 



CGT AAT 
Arg Asn 

CTC ATG 
Leu Met 

AGG GCT 
Arg Ala 
150 
GCA ACA 
Ala Thr 
165 

TTG CTG 
Leu Leu 



TTG GGT 
Leu Gly 
120 
GGT ACA 
Gly Tyr 
135 

CTA GCG 
Leu Ala 



GGG AAC 
Gly Asn 

TCC TGT 
Ser Cys 



GCG TCC 
Ala Ser 



GTG TAT 
Val Tyr 



ACA ATA 
Thr He 



GTG CCC TGC GTC 
Val Pro Cys Val 

230 

ACT CCC ACG CTT 
Thr Pro Thr Leu 
245 

CGG CGT CAT GTC 
Arg Arg His Val 



GGG GTG 
Gly Val 
200 
GAG GCG 
Glu Ala 

215 
CGG GAG 
Arg Glu 

GCG GCC 
Ala Ala 

GAC TTG 
Asp Leu 



Gly Arg Ala 
75 

GAG GGC TTG 294 
Glu Gly Leu 
90 

CCT AGT TGG 342 
Pro Ser Trp 
105 

AAG GTC ATC 390 
Lys Val He 

TTC CGC TCG 438 
He Pro Leu 

CAT GGC GTC 486 
His Gly Val 
155 

CTG CCT GGT 534 
Leu Pro Gly 
170 

TTG ACC ATC 582 

Leu Thr He 

185 

TAC CAT GTC 630 
Tyr His Val 



GCG GAC GTG 678 
Ala Asp Val 

AAC AAT TCC 726 
Asn Asn Ser 
235 

AGG AAC AGC 774 
Arg Asn Ser 
250 

CTC GTT GGG 822 
Leu Val Gly 
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75 



20 



255 260 265 

GCA GCT GCT CTC TGT TCC GCT ATG TAT GTG GGG GAT TTT TGC GGA TCT 870 
Ala Ala Ala Leu Cys Ser Ala Met Tyr Val Gly Asp Phe Cys Gly Ser 

270 275 280 

GTT TTC CTC GTC TCC CAG CTG TTC ACT TTC TCA CCT CGC CGG TAT GAG 918 
Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg Arg Tyr Glu

285 290 295 

ACG GTG CAA GAC TGC AAT TGC TCA ATC TAT CCC GGC CAT GTA TCA GGC 966 
Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Val Ser Gly
300 305 310 315 

CAT CGC ATG GCT TGG GAT ATG ATA ATG AAT TGG TCA CCT ACA ACA GCC 1014 
His Arg Met Ala Trp Asp Met Ile Met Asn Trp Ser Pro Thr Thr Ala

320 325 330 

CTA GTG GTA TCG CAG CTA CTC CGG ATC CCA CAA GCC GTC GTG GAT ATG 1062 
Leu Val Val Ser Gln Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met

335 340 345 

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC TAC TAT TCC 1110 
Val Ala Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser 

350 355 360 

ATG GTG GGG AAC TGG GCT AAG GTC TTG GTT GTG ATG CTG CTC TTC GCC 1158 
Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala 

365 370 375 

GGT GTT GAC GGG GGG ACC CAC GTG ACA GGG GGA AAG GTA GCC TAC ACC 1206 
Gly Val Asp Gly Gly Thr His Val Thr Gly Gly Lys Val Ala Tyr Thr 
380 385 390 395 

ACC CAG AGC TTT ACA TCC TTC TTT TCA CGA GGG CCG TCT CAG AGA ATC 1254 
Thr Gln Ser Phe Thr Ser Phe Phe Ser Arg Gly Pro Ser Gln Arg Ile

400 405 410 

CAG C
Gln

SEQ ID NO: 6

SEQUENCE LENGTH: 1554 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear

MOLECULE TYPE: cDNA to genomic RNA
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N1N3N10

CTCCACCATA GATCACTCCC CTGTGAGGAA CTACTGTCTT CACGCAGAAA GCGTCTAGCC 60
ATGGCGTTAG TATGAGTGTC GTGCAGCCTC CAGGACCCCC CCTCCCGGGA GAGCCATAGT 120 
GGTCTGCGGA ACCGGTGAGT ACACCGGAAT TGCCAGGACG ACCGGGTCCT TTCTTGGATC 180 
AACCCGCTCA ATGCCTGGAG ATTTGGGCGT GCCCCCGCGA GACTGCTAGC CGAGTAGTGT 240 
TGGGTCGCGA AAGGCCTTGT GGTACTGCCT GATAGGGTGC TTGCGAGTGC CCCGGGAGGT 300 
CTCGTAGACC GTGCATC ATG AGC ACA AAT CCA AAA CCC CAA AGA AAA ACC 350 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr
1 5 10 

AAA CGT AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT 398 
Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly

15 20 25 

GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG 446
Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg

30 35 40 

TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG TCG CAA CCT CGT 494
Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg

45 50 55 

GGA AGG CGA CAA CCT ATC CCC AAG GCT CGC CAA CCC GAG GGC AGG GCC 542 
Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln Pro Glu Gly Arg Ala
60 65 70 75 

TGG GCT CAG CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG 590
Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu

80 85 90 

GGG TGG GCA GGA TGG CTC CTG TCA CCC CGC GGC TCC CGG CCT AGT TGG 638 
Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp 
95 100 105

GGC CCC ACG GAC CCC CGG CGT AGG TCG CGT AAT TTG GGT AAG GTC ATC 686 
Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile
110 115 120 

GAT ACC CTC ACA TGC GGC TTC GCC GAT CTC ATG GGT ACA TTC CGC TCG 734
Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu

125 130 135

GTC GGC GCC CCC CTA GGG GGC GCT GCC AGG GCT CTA GCG CAT GGC GTC 782 
Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val 
140 145 150 155 

CGG GTT CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAT CTG CCT GGT 830
Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly 

160 165 170 

TGC TCC TTT TCT ATC TTC CTT TTG GCT TTG CTG TCC TGT TTG ACC ATC 878 
Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Ile

175 180 185 

CCA GCT TCC GCC TAC CAA GTG CGC AAC GCG TCC GGG GTG TAC CAT GTC 926
Pro Ala Ser Ala Tyr Gln Val Arg Asn Ala Ser Gly Val Tyr His Val

190 195 200 

ACG AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCG GCG GAC GTG 974 
Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Val

205 210 215 

ATT ATG CAC ACC CCC GGG TGC GTG CCC TGC GTC CGG GAG AAC AAT TCC 1022 
Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser
220 225 230 235 

TCC CGC TGC TGG GTA GCG CTC ACT CCC ACG CTT GCG GCC AGG AAC AGC 1070 
Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser 

240 245 250 

AGC ATC CCC ACT ACG ACA ATA CGG CGT CAT GTC GAC TTG CTC GTT GGG 1118 
Ser Ile Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val Gly

255 260 265 

GCA GCT GCT CTC TGT TCC GCT ATG TAT GTG GGG GAT TTT TGC GGA TCT 1166 
Ala Ala Ala Leu Cys Ser Ala Met Tyr Val Gly Asp Phe Cys Gly Ser 

270 275 280 

GTT TTC CTC GTC TCC CAG CTG TTC ACT TTC TCA CCT CGC CGG TAT GAG 1214 
Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg Arg Tyr Glu

285 290 295 

ACG GTG CAA GAC TGC AAT TGC TCA ATC TAT CCC GGC CAT GTA TCA GGC 1262 
Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Val Ser Gly
300 305 310 315 

CAT CGC ATG GCT TGG GAT ATG ATA ATG AAT TGG TCA CCT ACA ACA GCC 1310 
His Arg Met Ala Trp Asp Met Ile Met Asn Trp Ser Pro Thr Thr Ala

320 325 330

CTA GTG GTA TCG CAG CTA CTC CGG ATC CCA CAA GCC GTC GTG GAT ATG 1358
Leu Val Val Ser Gln Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met

335 340 345

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC TAC TAT TCC 1406
Val Ala Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser

350 355 360

ATG GTG GGG AAC TGG GCT AAG GTC TTG GTT GTG ATG CTG CTC TTC GCC 1454
Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala

365 370 375

GGT GTT GAC GGG GGG ACC CAC GTG ACA GGG GGA AAG GTA GCC TAC ACC 1502
Gly Val Asp Gly Gly Thr His Val Thr Gly Gly Lys Val Ala Tyr Thr
380 385 390 395

ACC CAG AGC TTT ACA TCC TTC TTT TCA CGA GGG CCG TCT CAG AGA ATC 1550
Thr Gln Ser Phe Thr Ser Phe Phe Ser Arg Gly Pro Ser Gln Arg Ile

400 405 410

CAG C 1554
Gln

SEQ ID NO: 7 

SEQUENCE LENGTH: 370 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY : linear 

MOLECULE TYPE: cDNA to genomic RNA 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: HN3 

GCAAGCTT ATG AGC ACA AAT CCA AAA CCC CAA AGA AAA ACC AAA CGT 47 
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg
1 5 10

AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG 95 
Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln

15 20 25

ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG TTG GGT 143
Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly
30 35 40 45

GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG CCG CAA CCT CGT GGA AGG 191 
Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Pro Gln Pro Arg Gly Arg

50 55 60 

CGA CAA CCT ATC CCC AAG GCT CGC CAA CCC GAG GGT AGG GCC TGG GCT 239 
Arg Gln Pro Ile Pro Lys Ala Arg Gln Pro Glu Gly Arg Ala Trp Ala

65 70 75 

CAG CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG GGG TGG 287 
Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp

80 85 90 

GCA GGA TGG CTC CTG TCA CCC CGC GGC TCC CGG CCT AGT TGG GGC CCC 335 
Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro 

95 100 105 

ACG GAC CCC CGG CGT AGG TGAAGATCTG AATTCGC 370 
Thr Asp Pro Arg Arg Arg 
110 115 

SEQ ID NO: 8 

SEQUENCE LENGTH: 1264 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 

MOLECULE TYPE: cDNA to genomic RNA 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: HN3N10AB 

GCAAGCTT ATG AGC ACA AAT CCA AAA CCC CAA AGA AAA ACC AAA CGT 47 
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg
1 5 10
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG 95
Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln
15 20 25 

ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG TTG GGT 143
Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly
30 35 40 45

GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG TCG CAA CCT CGT GGA AGG 191
Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg

50 55 60

CGA CAA CCT ATC CCC AAG GCT CGC CAA CCC GAG GGC AGG GCC TGG GCT 239
Arg Gln Pro Ile Pro Lys Ala Arg Gln Pro Glu Gly Arg Ala Trp Ala

65 70 75

CAG CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG GGG TGG 287
Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp

80 85 90

GCA GGA TGG CTC CTG TCA CCC CGC GGC TCC CGG CCT AGT TGG GGC CCC 335
Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro

95 100 105

ACG GAC CCC CGG CGT AGG TCG CGT AAT TTG GGT AAG GTC ATC GAT ACC 383
Thr Asp Pro Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr
110 115 120 125

CTC ACA TGC GGC TTC GCC GAT CTC ATG GGT ACA TTC CGC TCG GTC GGC 431
Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly

130 135 140

GCC CCC CTA GGG GGC GCT GCC AGG GCT CTA GCG CAT GGC GTC CGG GTT 479
Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val

145 150 155

CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAC CTG CCT GGT TGC TCC 527
Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser

160 165 170

TTT TCT ATC TTC CTT TTG GCT TTG CTG TCC TGT TTG ACC ATC CCA GCT 575
Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Ile Pro Ala

175 180 185

TCC GCC TAC CAA GTG CGC AAC GCG TCC GGG GTG TAC CAT GTC ACG AAC 623
Ser Ala Tyr Gln Val Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn
190 195 200 205

GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCG GCG GAC GTG ATT ATG 671
Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Val Ile Met

210 215 220

CAC ACC CCC GGG TGC GTG CCC TGC GTC CGG GAG AAC AAT TCC TCC CGC 719
His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser Arg

225 230 235

TGC TGG GTA GCG CTC ACT CCC ACG CTT GCG GCC AGG AAC AGC AGC ATC 767
Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser Ser Ile

240 245 250 

CCC ACT ACG ACA ATA CGG CGT CAT GTC GAC TTG CTC GTT GGG GCA GCT 815 
Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val Gly Ala Ala

255 260 265 

GCT CTC TGT TCC GCT ATG TAT GTG GGG GAT TTT TGC GGA TCT GTT TTC 863
Ala Leu Cys Ser Ala Met Tyr Val Gly Asp Phe Cys Gly Ser Val Phe 
270 275 280 285 

CTC GTC TCC CAG CTG TTC ACT TTC TCA CCT CGC CGG TAT GAG ACG GTG 911 
Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg Arg Tyr Glu Thr Val

290 295 300 

CAA GAC TGC AAT TGC TCA ATC TAT CCC GGC CAT GTA TCA GGC CAT CGC 959 
Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Val Ser Gly His Arg

305 310 315 

ATG GCT TGG GAT ATG ATA ATG AAT TGG TCA CCT ACA ACA GCC CTA GTG 1007 
Met Ala Trp Asp Met Ile Met Asn Trp Ser Pro Thr Thr Ala Leu Val

320 325 330 

GTA TCG CAG CTA CTC CGG ATA CCA CAA GCC GTC GTG GAT ATG GTG GCG 1055
Val Ser Gln Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Ala
335 340 345 

GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC TAC TAT TCC ATG GTG 1103 
Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val 
350 355 360 365 

GGG AAC TGG GCT AAG GTC TTG GTT GTG ATG CTG CTC TTC GCC GGT GTT 1151 
Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala Gly Val 

370 375 380 

GAC GGG GGG ACC CAC GTG ACA GGG GGA AAG GTA GCC TAC ACC ACC CAG 1199 
Asp Gly Gly Thr His Val Thr Gly Gly Lys Val Ala Tyr Thr Thr Gln

385 390 395 

AGC TTT ACA TCC TTC TTT TCA CGA GGG CCG TCT CAG AGA ATC 1247 
Ser Phe Thr Ser Phe Phe Ser Arg Gly Pro Ser Gln Arg Ile

400 405 410 

TGAAGATCTG AATTCGC 

SEQ ID NO: 9

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SEQUENCE LENGTH: 483 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No 

MOLECULE TYPE: cDNA to genomic RNA 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N1-2

CTCCACCATA GATCACTCCC CTGTGAGGAA CTACTGTCTT CACGCAGAAA GCGTCTAGCC 60 
ATGGCGTTAG TATGAGTGTC GTGCAGCCTC CAGGCCCCCC CCTCCCGGGA GAGCCATAGT 120 
GGTCTGCGGA ACCGGTGAGT ACACCGGAAT TGCCAGGACG ACCGGGTCCT TTCTTGGATC 180 
AACCCGCTCA ATGCCTGGAG ATTTGGGCGT GCCCCCGCGA GACTGCTAGC CGAGTAGTGT 240 
TGGGTCGCGA AAGGCCTTGT GGTACTGCCT GATAGGGTGC TTGCGAGTGC CCCGGGAGGT 300 
CTCGTAGACC GTGCATC ATG AGC ACA AAT CCT AAA CCC CAA AGA CAA ACC 350 

Met Ser Thr Asn Pro Lys Pro Gln Arg Gln Thr
1 5 10
AAA CGT AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT 398 
Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly

15 20 25 

GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG 446 
Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg

30 35 40 

TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG T 483 
Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg 
45 50 



SEQ ID NO: 10 

SEQUENCE LENGTH: 483 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No 

MOLECULE TYPE: cDNA to genomic RNA 
ORIGINAL SOURCE 
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ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: S1-1

CTCCACCATA GATCACTCCC CTGTGAGGAA CTACTGTCTT CACGCAGAAA GCGTCTAGCC 60 
ATGGCGTTAG TATGAGTGTC GTGCAGCCTC CAGGACCCCC CCTCCCGGGA GAGCCATAGT 120 
GGTCTGCGGA ACCGGTGAGT ACACCGGAAT TGCCAGGACG ACCGGGTCCT TTCTTGGATT 180 
AACCCGCTCA ATGCCTGGAG ATTTGGGCGT GCCCCCGCGA GACCGCTAGC CGAGTAGTGT 240 
TGGGTCGCGA AAGGCCTTGT GGTACTGCCT GATAGGGTGC TTGCGAGTGC CCCGGGAGGT 300 
CTCGTAGACC GTGCACC ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC 350 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr
1 5 10
AAA CGT AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGT GGT 398 
Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly

15 20 25 

GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG 446 
Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg

30 35 40 

TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG T 483 
Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg 
45 50 

SEQ ID NO: 11 

SEQUENCE LENGTH: 483 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 

MOLECULE TYPE: cDNA to genomic RNA 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: S1-2

CTCCACCATA GATCACTCCC CTGTGAGGAA CTACTGTCTT CACGCAGAAA GCGTCTAGCC 60 
ATGGCGTTAG TATGAGTGTC GTGCAGCCTC CAGGACCCCC CCTCCCGGGA GAGCCATAGT 120 
GGTCTGCGGA ACCGGTGAGT ACACCGGAAT TGCCAGGACG ACCGGGTCCT TTCTTGGATT 180 
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AACCCGCTCA ATGCCTGGAG ATTTGGGCGT GCCCCCGCGA GACCGCTAGC CGAGTAGTGT 240 
TGGGTCGCGA AAGGCCTTGT GGTACTGCCT GATAGGGTGC TTGCGAGTGC CCCGGGAGGT 300 
CTCGTAGACC GTGCACC ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC 350 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr
1 5 10 

AAA CGT AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGT GGT 398 
Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly

15 20 25 

GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG 446
Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg

30 35 40 

TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG T 483 
Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg 
45 50 

SEQ ID NO: 12

SEQUENCE LENGTH: 483 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No

MOLECULE TYPE: cDNA to genomic RNA
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: S1-3

CTCCACCATA GATCACTCCC CTGTGAGGAA CTACTGTCTT CACGCAGAAA GCGTCTAGCC 60
ATGGCGTTAG TATGAGTGTC GTGCAGCCTC CAGGACCCCC CCTCCCGGGA GAGCCATAGT 120 
GGTCTGCGGA ACCGGTGAGT ACACCGGAAT TGCCAGGACG ACCGGGTCCT TTCTTGGATT 180 
AACCCGCTCA ATGCCTGGAG ATTTGGGCGT GCCCCCGCGA GACCGCTAGC CGAGTAGTGT 240 

TGGGTCGCGA AAGGCCTTGT GGTACTGCCT GATGGGGTGC TTGCGAGTGC CCCGGGAGGT 300
CTCGTAGACC GTGCACC ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC 350 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr
1 5 10 

AAA CGT AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGT GGT 398
Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly
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15 20 25 

GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG 446
Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg

30 35 40 

TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG T 483 
Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg 
45 50 



SEQ ID NO: 13 

SEQUENCE LENGTH: 339 base pairs 
SEQUENCE TYPE: nucleic acid
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N27-1 



C CGG ATC CCA CAA GCC GTC GTG GAT ATG GTG GCG GGG GCC CAC TGG GGA 49 
Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly

5 10 15 

GTC CTA GCG GGC CTT GCC TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG 97 
Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys 

20 25 30 

GTC TTG GTT GTG ATG CTG CTC TTC GCC GGT GTT GAC GGG AGG ACC CAC 145
Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly Arg Thr His

35 40 45 

GTG ACA GGA GGG AAG GTA GCC TAC ACC ACC CAG AGG TTT ACA TCC TTC 193
Val Thr Gly Gly Lys Val Ala Tyr Thr Thr Gln Arg Phe Thr Ser Phe

50 55 60 

TTT TCA CGA GGG CCG TCC CAG AAA ATC CAA CTT GTA AAC ACT AAC GGC 241 
Phe Ser Arg Gly Pro Ser Gln Lys Ile Gln Leu Val Asn Thr Asn Gly
65 70 75 80 

AGC TGG CAC ATC AAC AGG ACT GCC CTG AAT TGC AAT GAC TCC CTT AAC 289
Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Asn

85 90 95 
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ACC GGG TTC CTT GCC GCG CTG TTC TAC ACC CAC AGC TTC AAC GCG TCC GG 339 
Thr Gly Phe Leu Ala Ala Leu Phe Tyr Thr His Ser Phe Asn Ala Ser

100 105 110 

SEQ ID NO: 14 

SEQUENCE LENGTH : 339 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N27-2 

C CGG ATC CCA CAA GCC GTG GTG GAT ATG GTG GCG GGG GCC CAC TGG GGA 49
Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly

5 10 15

GTC CTG GCG GGC CTT GCC TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG 97
Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys

20 25 30

GTC TTG GTT GTG ATG CTG CTT TTC GCC GGT GTT GAC GGG GGG ACC CAC 145
Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly Gly Thr His

35 40 45

GTG ACA GGG GGG AAG GTA GCC TAC ACC ACC CAG AGC TTC ACA TCC TTC 193
Val Thr Gly Gly Lys Val Ala Tyr Thr Thr Gln Ser Phe Thr Ser Phe

50 55 60

TTT TCA CGA GGG CCG TCT CAG AGG ATC CAA CTT GTA AAC ACT AAC GGC 241
Phe Ser Arg Gly Pro Ser Gln Arg Ile Gln Leu Val Asn Thr Asn Gly
65 70 75 80

AGC TGG CAC ATC AAT AGG ACT GCC CTG AAT TGC AAT GAC TCC CTT AAC 289
Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Asn

85 90 95

ACC GGG TTC CTT GCC GCG CTG TTC TAC ACC CAC AGC TTC AAC GCG TCC GG 339
Thr Gly Phe Leu Ala Ala Leu Phe Tyr Thr His Ser Phe Asn Ala Ser

100 105 110

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SEQ ID NO: 15 

SEQUENCE LENGTH: 339 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N27-3 



C CGG ATC CCA CAA GCC GTG GTG GAT ATG GTG GCA GGG GCC CAC TGG GGA 49
Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly

5 10 15 

GTC CTG GCG GGC CTT GCC TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG 97 
Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys 

20 25 30 

GTC TTG GTT GTG ATG CTG CTC TTC GCC GGT GTT GAC GGG GGG ACC CAC 145 
Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly Gly Thr His 
35 40 45

GTG ACA GGG GGG AAG GTA GCC TAC ACC ACC CAG GGC TTT ACA CCC TTC 193 
Val Thr Gly Gly Lys Val Ala Tyr Thr Thr Gln Gly Phe Thr Pro Phe

50 55 60 

TTT TCA CGA GGG CCG TCT CAG AAA ATC CAA CTT GTA AAC ACT AAC GGC 241 
Phe Ser Arg Gly Pro Ser Gln Lys Ile Gln Leu Val Asn Thr Asn Gly
65 70 75 80 

AGC TGG CAC ATC AAT AGG ACT GCC CTC AAT TGC AAT GAC TCC CTT AAC 289 
Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Asn

85 90 95 

ACC GGG TTC CTT GCC GCG CTG TTC TAC ACC CAC AGC TTC AAC GCG TCC GG 339
Thr Gly Phe Leu Ala Ala Leu Phe Tyr Thr His Ser Phe Asn Ala Ser
100 105 110



SEQ ID NO: 16 

SEQUENCE LENGTH: 393 base pairs 
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double 
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TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: N19-1 



GAG GCC GTG AAC TGC GAT GAC TCC CTT AAC ACC GGG TTC CTT GCC GCG 48 
Glu Ala Val Asn Cys Asp Asp Ser Leu Asn Thr Gly Phe Leu Ala Ala 

1 5 10 15

CTG TTC TAC ACG CAC AGG TTC AAC GCG TCC GGA TGT CCG GAG CGT ATG 96
Leu Phe Tyr Thr His Arg Phe Asn Ala Ser Gly Cys Pro Glu Arg Met 

20 25 30 

GCC GGT TGC CGC CCC ATT GAC GAG TTC GCT CAG GGG TGG GGT CCC ATC 144 
Ala Gly Cys Arg Pro Ile Asp Glu Phe Ala Gln Gly Trp Gly Pro Ile

35 40 45 

ACT CAT GTT GTG CCT AAC ATC TCG GAC CAG AGG CCC TAT TGC TGG CAC 192 
Thr His Val Val Pro Asn Ile Ser Asp Gln Arg Pro Tyr Cys Trp His

50 55 60 

TAC GCG CCT CGA CCG TGT GGT ATC GTA CCC GCG TCG CAG GTG TGT GGT 240 
Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly
65 70 75 80 

CCG GTG TAT TGC TTC ACC CCA AGC CCT GTT GTG GTG GGG ACG ACC GAT 288 
Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp 

85 90 95 

CGT TTC GGC GCC CCC ACG TAC AAC TGG GGA AAC AAT GAG ACG GAT GTG 336 
Arg Phe Gly Ala Pro Thr Tyr Asn Trp Gly Asn Asn Glu Thr Asp Val 
100 105 110

CTA CTC CTC AAC AAC ACA CGG CCG CCG CAG GGC AAC TGG TTC GGT TGT 384 
Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys
115 120 125 

ACC TGG ATG 393

Thr Trp Met 
130 

SEQ ID NO: 17

SEQUENCE LENGTH: 393 base pairs 
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SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N19-2 



GAG GCC GTG AAC TGC GAT GAC TCC CTT AAC ACC GGG TTC CTT GCC GCG 48 
Glu Ala Val Asn Cys Asp Asp Ser Leu Asn Thr Gly Phe Leu Ala Ala 
1 5 10 15

CTG TTC TAC ACG CAC AGG TTC AAC GCG TCC GGA TGT CCG GAG CGT ATG 96 
Leu Phe Tyr Thr His Arg Phe Asn Ala Ser Gly Cys Pro Glu Arg Met 

20 25 30 

GCC AGT TGC CGC CCC ATT GAC GAG TTC GCT CAG GGG TGG GGT CCC ATC 144 
Ala Ser Cys Arg Pro Ile Asp Glu Phe Ala Gln Gly Trp Gly Pro Ile

35 40 45 

ACT CAT GTT GTG CCT AAC ATC TCG GAC CAG AGG CCC TAT TGC TGG CAC 192 
Thr His Val Val Pro Asn Ile Ser Asp Gln Arg Pro Tyr Cys Trp His

50 55 60 

TAC GCG CCT CGA CCG TGT GGT ATC GTA CCC GCG TCG CAG GTG TGT GGT 240 
Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly
65 70 75 80 

CCG GTG TAT TGC TTC ACC CCA AGC CCT GTT GTG GTG GGG ACG ACC GAT 288 
Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp 

85 90 95 

CGT TTC GGC GCC CCC ACG TAT AAC TGG GGG AAC AAT GAG ACG GAT GTG 336 
Arg Phe Gly Ala Pro Thr Tyr Asn Trp Gly Asn Asn Glu Thr Asp Val 

100 105 110 

CTA CTC CTC AAC AAC ACA CGG CCG CCG CAA GGC AAC TGG TTC GGT TGT 384
Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys

115 120 125 

ACC TGG ATG 393 
Thr Trp Met 
130 




EP 0 518 313 A2 



SEQ ID NO: 18 

SEQUENCE LENGTH: 393 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N19-3 



GAG GCC GTG AAC TGC GAT GAC TCC CTT AAC ACC GGG TTC CTT GCC GCG 48
Glu Ala Val Asn Cys Asp Asp Ser Leu Asn Thr Gly Phe Leu Ala Ala
1 5 10 15

CTG TTC TAC ACG CAC AGG TTC AAC GCG TCC GGA TGT CCG GAG CGT ATG 96
Leu Phe Tyr Thr His Arg Phe Asn Ala Ser Gly Cys Pro Glu Arg Met

20 25 30

GCC AGT TGC CGC CCC ATT GAT GAG TTC GCT CAG GGG TGG GGT CCC ATC 144
Ala Ser Cys Arg Pro Ile Asp Glu Phe Ala Gln Gly Trp Gly Pro Ile

35 40 45

ACT CAT GTT GTG CCT AAC ATC TCG GAC CAG AGG CCC TAT TGC TGG CAC 192
Thr His Val Val Pro Asn Ile Ser Asp Gln Arg Pro Tyr Cys Trp His

50 55 60

TAC GCG CCT CGA CCG TGT GGT ATC GTA CCC GCG TGG CAG GTG TGT GGT 240
Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Trp Gln Val Cys Gly
65 70 75 80

CCG GTG TAT TGC TTC ACC CCA AGC CCT GTT GTG GTG GGG ACG ACC GAT 288
Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp

85 90 95

CGT TTC GGC GCC CCC ACG TAT AAC TGG GGG AAC AAT GAG ACG GAT GTG 336
Arg Phe Gly Ala Pro Thr Tyr Asn Trp Gly Asn Asn Glu Thr Asp Val

100 105 110

CTA CTC CTC AAC AAC ACA CGG CCG CCG CAA GGC AAC TGG TTC GGT TGT 384
Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys

115 120 125

ACC TGG ATG 393
Thr Trp Met
130

SEQ ID NO: 19 

SEQUENCE LENGTH: 393 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: H19-2

GAG GCC GTG AAC TGC GAT GAC TCC CTC CAG ACT GGG TTC CTT GCC GCG 48
Glu Ala Val Asn Cys Asp Asp Ser Leu Gln Thr Gly Phe Leu Ala Ala
1 5 10 15

CTG TTC TAC AGG CAC AGG TTC AAC GCA TCC GGG TGC CCA GAA CGC ATG 96
Leu Phe Tyr Arg His Arg Phe Asn Ala Ser Gly Cys Pro Glu Arg Met

20 25 30

GCC AGC TGC CGC CCC ATT AGC GAG TTC GCT CAG GGG TGG GGT CCT ATC 144
Ala Ser Cys Arg Pro Ile Ser Glu Phe Ala Gln Gly Trp Gly Pro Ile

35 40 45

ACT CAT GTT GTG CCT GAC GTG TCG GAC CAG AGG CCT TAT TGC TGG CAC 192
Thr His Val Val Pro Asp Val Ser Asp Gln Arg Pro Tyr Cys Trp His

50 55 60

TAC GCG CCT CGA CCG TGC GGT ATC GTA CCC GCG TCG CAG GTG TGT GGT 240
Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly
65 70 75 80

CCA GTG TAT TGC TTC ACC CCA AGC CCT GTC GTG GTG GGG ACG ACC GAT 288
Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp

85 90 95

CGT TCT GGC GCC CCC ACG TAC ACC TGG GGG GCG AAT GAG ACG GAC GTG 336
Arg Ser Gly Ala Pro Thr Tyr Thr Trp Gly Ala Asn Glu Thr Asp Val

100 105 110

CTA CTC CTT AAC AAC ACG CGT CCG CCA CAG GGC AAC TGG TTC GGT TGT 384
Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys

115 120 125

ACC TGG ATG 393
Thr Trp Met
130

SEQ ID NO: 20

SEQUENCE LENGTH: 393 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: H19-4



GAG GCC GTG AAC TGC GAT GAC TCC CTC CAG ACT GGG TTC CTT GCC GCG 48
Glu Ala Val Asn Cys Asp Asp Ser Leu Gln Thr Gly Phe Leu Ala Ala

1 5 10 15

CTG TTC TAC AGG CAC AGG TTC AAC GCA TCC GGG TGC CCA GAA CGC ATG 96
Leu Phe Tyr Arg His Arg Phe Asn Ala Ser Gly Cys Pro Glu Arg Met

20 25 30

GCC AGC TGT CGC CCC ATT AGC GAG TTC GCT CAG GGG TGG GGC CCT ATC 144
Ala Ser Cys Arg Pro Ile Ser Glu Phe Ala Gln Gly Trp Gly Pro Ile
35 40 45

ACT CAT GTT GTG CCT GAC GTG TCG GAC CAG AGG CCT TAT TGC TGG CAC 192
Thr His Val Val Pro Asp Val Ser Asp Gln Arg Pro Tyr Cys Trp His
50 55 60

TAC GCG CCT CGA CCG TGC GGT ATC GTA CCC GCG TCG CAG GTG TGT GGT 240
Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly
65 70 75 80

CCA GTG TAT TGC TTC ACC CCA AGC CCT GTC GTG GTG GGG ACG ACC GAT 288
Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp

85 90 95

CGC TCT GGC GCC CCC ACG TAC ACC TGG GGG GCG AAT GAG ACG GAC GTG 336
Arg Ser Gly Ala Pro Thr Tyr Thr Trp Gly Ala Asn Glu Thr Asp Val
100 105 110

CTA CTC CTT AAC AAC ACG CGT CCG CCA CAG GGC AAC TGG TTC GGT TGT 384
Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys

115 120 125

ACC TGG ATG 393
Thr Trp Met
130

SEQ ID NO: 21 

SEQUENCE LENGTH: 393 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: H19-10

GAG GCC GTG AAC TGC GAT GAC TCC CTC CAG ACT GGG TTC CTT GCC GCG 48
Glu Ala Val Asn Cys Asp Asp Ser Leu Gln Thr Gly Phe Leu Ala Ala

1 5 10 15

CTG TTC TAC AGG CAC AGG TTC AAC GCA TCC GGG TGC CCA GAA CGC ATG 96
Leu Phe Tyr Arg His Arg Phe Asn Ala Ser Gly Cys Pro Glu Arg Met

20 25 30

GCC AGC TGC CGC CCC ATT AGC GAG TTC GCT CAG GGG TGG GGC CCT ATC 144
Ala Ser Cys Arg Pro Ile Ser Glu Phe Ala Gln Gly Trp Gly Pro Ile

35 40 45

ACT CAT GTT GTG CCT GAC GTG TCG GAC CAG AGG CCT TAT TGC TGG CAC 192
Thr His Val Val Pro Asp Val Ser Asp Gln Arg Pro Tyr Cys Trp His

50 55 60

TAC GCA CCT CGA CCG TGC GGT GTC GTA CCC GCG TCG CAG GTG TGT GGT 240
Tyr Ala Pro Arg Pro Cys Gly Val Val Pro Ala Ser Gln Val Cys Gly

65 70 75 80

CCA GTG TAT TGC TTC ACC CCA AGC CCT GTC GTG GTG GGG ACG ACC GAT 288
Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp

85 90 95

CGC TCT GGC GCC CCC ACG TAC ACC TGG GGG GCG AAT GAG ACG GAC GTG 336
Arg Ser Gly Ala Pro Thr Tyr Thr Trp Gly Ala Asn Glu Thr Asp Val

100 105 110

CTA CTC CTT AAC AAC ACG CGT CCG CCA CAG GGC AAC TGG TTC GGT TGT 384
Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys

115 120 125

ACC TGG ATG 393
Thr Trp Met

130

SEQ ID NO: 22 

SEQUENCE LENGTH: 393 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: Y19-4

GAG GCC GTG AAC TGC GAT GAC TCC CTC CAG ACT GGG TTC CTT GCC GCG 48
Glu Ala Val Asn Cys Asp Asp Ser Leu Gln Thr Gly Phe Leu Ala Ala
1 5 10 15

CTG TTC TAC AGG CAC AGG TTC AAC GCA TCC GGG TGC CCA GAA CGC ATG 96
Leu Phe Tyr Arg His Arg Phe Asn Ala Ser Gly Cys Pro Glu Arg Met
20 25 30

GCC AGC TGT CGC CCC ATT AGC GAG TTC GCT CAG GGG TGG GGC CCT ATC 144
Ala Ser Cys Arg Pro Ile Ser Glu Phe Ala Gln Gly Trp Gly Pro Ile
35 40 45

ACT CAT GTT GTG CCT GAC GTG TCG GAC CAG AGG CCT TAT TGC TGG CAC 192
Thr His Val Val Pro Asp Val Ser Asp Gln Arg Pro Tyr Cys Trp His
50 55 60

TAC GCG CCT CGA CCG TGC GGT ATC GTA CCC GCG TCG CAG GTG TGT GGT 240
Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly
65 70 75 80

CCA GTG TAT TGC TTC ACC CCA AGC CCT GTC GTG GTG GGG ACG ACC GAT 288
Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp
85 90 95

CGC TCT GGC GCC CCC ACG TAC ACC TGG GGG GCG AAT GAG ACG GAC GTG 336
Arg Ser Gly Ala Pro Thr Tyr Thr Trp Gly Ala Asn Glu Thr Asp Val
100 105 110

CTA CTC CTT AAC AAC ACG CGT CCG CCA CAG GGC AAC TGG TTC GGT TGT 384
Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys
115 120 125

ACC TGG ATG 393
Thr Trp Met
130



SEQ ID NO: 23 

SEQUENCE LENGTH: 393 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: Y19-6

GAG GCC GTG AAC TGC GAT GAC TCC CTC CAG ACT GGG TTC CTT GCC ACG 48
Glu Ala Val Asn Cys Asp Asp Ser Leu Gln Thr Gly Phe Leu Ala Thr
1 5 10 15

CTG TTC TAC AGG CAC AGG TTC AAC GCA TCC GGG TGC CCA GAA CGC ATG 96
Leu Phe Tyr Arg His Arg Phe Asn Ala Ser Gly Cys Pro Glu Arg Met
20 25 30

GCC AGC TGT CGC CCC ATT AGC GAG TTC GCT CAG GGG TGG GAC CCT ATC 144
Ala Ser Cys Arg Pro Ile Ser Glu Phe Ala Gln Gly Trp Asp Pro Ile
35 40 45

ACT CAT GTT GTG CCT GAC GTG TCG GAC CAG AGG CCT TAT TGC TGG CAC 192
Thr His Val Val Pro Asp Val Ser Asp Gln Arg Pro Tyr Cys Trp His
50 55 60

TAC GCG CCT CGA CCG TGC GGT ATC GTA CCC GCG TCG CAG GTG TGT GGT 240
Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly
65 70 75 80

CCA GTG TAT TGC TTC ACC CCA AGC CCT GTC GTG GTG GGG ACG ACC GAT 288
Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp
85 90 95

CGC TCT GGC GCC CCC ACG TAC ACC TGG GGG GCG AAT GAG ACG GAC GTG 336
Arg Ser Gly Ala Pro Thr Tyr Thr Trp Gly Ala Asn Glu Thr Asp Val
100 105 110

CTA CTC CTT AAC AAC ACG CGT CCG CCA CAG GGC AAC TGG TTC GGT TGT 384
Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys
115 120 125

ACC TGG ATG 393
Thr Trp Met
130



SEQ ID NO: 24

SEQUENCE LENGTH: 393 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: Y19-7

GAG GCC GTG AAC TGC GAT GAC TCC CTC CAG ACT GGG TTC CTT GCC GCG 48
Glu Ala Val Asn Cys Asp Asp Ser Leu Gln Thr Gly Phe Leu Ala Ala
1 5 10 15

CTG TTC TAC AGG CAT AGG TTC GAC GCA TCC GGG TGC CCA GAA CGC ATG 96
Leu Phe Tyr Arg His Arg Phe Asp Ala Ser Gly Cys Pro Glu Arg Met
20 25 30

GCC AGC TGT CGC CCC ATT AGC GAG TTC GCT CAG GGG TGG GGC CCT ATC 144
Ala Ser Cys Arg Pro Ile Ser Glu Phe Ala Gln Gly Trp Gly Pro Ile
35 40 45

ACT CAT GTT GTG CCT GAC GTG TCG GAC CAG AGG CCT TAT TGC TGG CAC 192
Thr His Val Val Pro Asp Val Ser Asp Gln Arg Pro Tyr Cys Trp His
50 55 60

TAC GCG CCT CGA CCG TGC GGT ATC GTA CCC GCG TCG CAG GTG TGT GGT 240
Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly
65 70 75 80

CCA GTG TAT TGC TTC ACC CCA AGC CCT GTC GTG GTG GGG ACG ACC GAT 288
Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp
85 90 95

CGC TCT GGC GCC CCC ACG TAC ACC TGG GGG GCG AAT GAG ACG GAC GTG 336
Arg Ser Gly Ala Pro Thr Tyr Thr Trp Gly Ala Asn Glu Thr Asp Val
100 105 110

CTA CTC CTT AAC AAC ACG CGT CCG CCA CAG GGC AAC TGG TTC GGT TGT 384
Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys
115 120 125

ACC TGG ATG 393
Thr Trp Met
130



SEQ ID NO: 25

SEQUENCE LENGTH: 629 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: MX24-4

AAC ACA CGG CCG CCG CAG GGG AAC TGG TTT GGC TGT ACA TGG ATG AAT 48
Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys Thr Trp Met Asn
1 5 10 15

GGC ACT GGG TTC ACA AAG ACG TGC GGG GGC CCC CCG TGC AAC ATC GGG 96
Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly
20 25 30

GGG GTC GGC AAC AAT ACC TTG ACT TGC CCC ACG GAC TGC TTC CGG AAG 144
Gly Val Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys
35 40 45

CAC CCC GAG GCC ACT TAC ACA AAA TGT GGT TCG GGG CCT TGG TTG ACG 192
His Pro Glu Ala Thr Tyr Thr Lys Cys Gly Ser Gly Pro Trp Leu Thr
50 55 60

CCT AGG TGC CTA GTT CAT TAC CCA TAC AGG CTC TGG CAC TAT CCC TGC 240
Pro Arg Cys Leu Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys
65 70 75 80

ACT GTC AAC TTT ACC ATC TTC AAG GTT AGG ATG TAT GTG GGG GGC GTG 288
Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val
85 90 95

GAA CAC AGG CTT GAA GCT GCA TGC AAT TGG ACC CGA GGA GAG CGT TGT 336
Glu His Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys
100 105 110

GAC TTG GAG GAC AGG GAT AGA TCA GAG CTT AGC CCG CTA TTG CTG TCC 384
Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser
115 120 125

ACA ACA GAG TGG CAG GTA CTG CCC TGT TCC TTC ACC ACC CTG CCG GCT 432
Thr Thr Glu Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala
130 135 140

CTG TCC ACT GGT TTG ATT CAT CTC CAT CAG AAC ATC GTG GAC GTG CAA 480
Leu Ser Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln
145 150 155 160

TAT CTG TAC GGC ATA GGG TCG GCG GTT GTC TCC TTC GCA ATC AAA TGG 528
Tyr Leu Tyr Gly Ile Gly Ser Ala Val Val Ser Phe Ala Ile Lys Trp
165 170 175

GAA TAT ATT CTG TTG CTT TTC CTC CTC CTG GCG GAC GCG CGC GTC TGT 576
Glu Tyr Ile Leu Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys
180 185 190

GCC TGC TTG TGG ATG ATG CTG CTG ATA GCC CAC GCC GAC GCC ACC TTA 624
Ala Cys Leu Trp Met Met Leu Leu Ile Ala His Ala Asp Ala Thr Leu
195 200 205

GAG AA 629
Glu



SEQ ID NO: 26 
SEQUENCE LENGTH: 629 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: MX24-5

AAC ACA CGG CCG CCG CAG GGG AAC TGG TTT GGC TGT ACA TGG ATG AAT 48
Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys Thr Trp Met Asn
1 5 10 15

GGC ACT GGG TTC ACA AAG ACG TGC GGG GGC CCC CCG TGC AAC ATC GGG 96
Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly
20 25 30

GGG GTC GGC AAC AAT ACC TTG ACT TGC CCC ACG GAC TGC TTC CGG AAG 144
Gly Val Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys
35 40 45

CAC CCC GAG GCC ACT TAC ACA AAA TGT GGT TCG GGG CCT TGG TTG ACG 192
His Pro Glu Ala Thr Tyr Thr Lys Cys Gly Ser Gly Pro Trp Leu Thr
50 55 60

CCT AGG TGC CTA GTT CAT TAC CCA TAC AGG CTC TGG CAC TAT CCC TGC 240
Pro Arg Cys Leu Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys
65 70 75 80

ACT GTC AAC TTT ACC ATC TTC AAG GTT AGG ATG TAT GTG GGG GGC GTG 288
Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val
85 90 95

GAA CAC AGG CTT GAA GCT GCA TGC AAT TGG ACC CGA GGA GAG CGT TGT 336
Glu His Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys
100 105 110

GAC TTG GAG GAC AGG GAT AGA TCA GAG CTT AGC CCG CTA TTG CTG TCT 384
Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser
115 120 125

ACA ACA GAG TGG CAG GTA CTG CCC TGT TCC TTC ACC ACC CTG CCG GCT 432
Thr Thr Glu Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala
130 135 140

CTG TCC ACT GGT TTG ATT CAT CTC CAT CAG AAC ATC GTG GAC GTG CAA 480
Leu Ser Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln
145 150 155 160

TAT TTG TAC GGC ATA GGG TCG GCG GTT GTC TCC TTC GCA ATC AAA TGG 528
Tyr Leu Tyr Gly Ile Gly Ser Ala Val Val Ser Phe Ala Ile Lys Trp
165 170 175

GAA TAT ATT CTG TTG CTT TTC CTT CTC CTG GCG GAC GCG CGC GTC TGT 576
Glu Tyr Ile Leu Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys
180 185 190

GCC TGC TTG TGG ATG ATG CTG CTG ATA GCC CAC GCC GAC GCC ACC TTA 624
Ala Cys Leu Trp Met Met Leu Leu Ile Ala His Ala Asp Ala Thr Leu
195 200 205

GAG AA 629
Glu



SEQ ID NO: 27

SEQUENCE LENGTH: 629 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: MX24-13

AAC ACA CGG CCG CCG CAG GGG AAC TGG TTT GGC TGT ACA TGG ATG AAT 48
Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys Thr Trp Met Asn
1 5 10 15

GGC ACT GGG TTC ACA AAG ACG TGC GGG GGC CCC CCG TGC AAC ATC GGG 96
Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly
20 25 30

GGG GTC GGC AAC AAT ACC TTG ACT TGC CCC ACG GAC TGC TTC CGG AAG 144
Gly Val Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys
35 40 45

CAC CCC GAG GCC ACT TAC ACA AAA TGT GGT TCG GGG CCT TGG CTG ACG 192
His Pro Glu Ala Thr Tyr Thr Lys Cys Gly Ser Gly Pro Trp Leu Thr
50 55 60

CCT AGG TGC CTA GTT CAT TAC CCA TAC AGG CTC TGG CAC TAT CCC TGC 240
Pro Arg Cys Leu Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys
65 70 75 80

ACT GTC AAC TTT ACC ATC TTC AAG GTT AGG ATG TAT GTG GGG GGC GTG 288
Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val
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EP 0 518 313 A2 



85 90 95

GAA CAC AGG CTT GAA GCT GCA TGC AAT TGG ACC CGC GGA GAG CGT TGT 336
Glu His Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys
100 105 110

GAC TTG GAG GAC AGG GAT AGA TCA GAG CTT AGC CCG CTA TTG CTG TCT 384
Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser
115 120 125

ACA ACA GAG TGG CAG GTA CTG CCC TGT TCC TTC ACC ACC CTG CCG GCT 432
Thr Thr Glu Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala
130 135 140

CTG TCC ACT GGT TTG ATT CAT CTC CAT CAG AAC ATC GTG GAC GTG CAA 480
Leu Ser Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln
145 150 155 160

TAT CTG TAC GGC ATA GGG TCG GCG GTT GTC TCC TTC GCA ATC AAA TGG 528
Tyr Leu Tyr Gly Ile Gly Ser Ala Val Val Ser Phe Ala Ile Lys Trp
165 170 175

GAA TAT ATT CTG TTG CTT TTC CTT CTC CTG GCG GAC GCA CGC GTC TGT 576
Glu Tyr Ile Leu Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys
180 185 190

GCC TGC TTG TGG ATG ATG CTG CTG ATA GCC CAC GCC GAC GCC ACC TTA 624
Ala Cys Leu Trp Met Met Leu Leu Ile Ala His Ala Asp Ala Thr Leu
195 200 205

GAG AA 629
Glu



SEQ ID NO: 28 

SEQUENCE LENGTH: 652 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N27N19-1 



C CGG ATC CCA CAA GCC GTG GTG GAT ATG GTG GCA GGG GCC CAC TGG GGA 49
Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly
1 5 10 15

GTC CTG GCG GGC CTT GCC TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG 97
Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys
20 25 30

GTC TTG GTT GTG ATG CTG CTC TTC GCC GGT GTT GAC GGG GGG ACC CAC 145
Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly Gly Thr His
35 40 45

GTG ACA GGG GGG AAG GTA GCC TAC ACC ACC CAG GGC TTT ACA CCC TTC 193
Val Thr Gly Gly Lys Val Ala Tyr Thr Thr Gln Gly Phe Thr Pro Phe
50 55 60

TTT TCA CGA GGG CCG TCT CAG AAA ATC CAA CTT GTA AAC ACT AAC GGC 241
Phe Ser Arg Gly Pro Ser Gln Lys Ile Gln Leu Val Asn Thr Asn Gly
65 70 75 80

AGC TGG CAC ATC AAT AGG ACT GCC CTC AAT TGC AAT GAC TCC CTT AAC 289
Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Asn
85 90 95

ACC GGG TTC CTT GCC GCG CTG TTC TAC ACC CAC AGC TTC AAC GCG TCC 337
Thr Gly Phe Leu Ala Ala Leu Phe Tyr Thr His Ser Phe Asn Ala Ser
100 105 110

GGA TGT CCG GAG CGT ATG GCC GGT TGC CGC CCC ATT GAC GAG TTC GCT 385
Gly Cys Pro Glu Arg Met Ala Gly Cys Arg Pro Ile Asp Glu Phe Ala
115 120 125

CAG GGG TGG GGT CCC ATC ACT CAT GTT GTG CCT AAC ATC TCG GAC CAG 433
Gln Gly Trp Gly Pro Ile Thr His Val Val Pro Asn Ile Ser Asp Gln
130 135 140

AGG CCC TAT TGC TGG CAC TAC GCG CCT CGA CCG TGT GGT ATC GTA CCC 481
Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro
145 150 155 160

GCG TCG CAG GTG TGT GGT CCG GTG TAT TGC TTC ACC CCA AGC CCT GTT 529
Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val
165 170 175

GTG GTG GGG ACG ACC GAT CGT TTC GGC GCC CCC ACG TAC AAC TGG GGA 577
Val Val Gly Thr Thr Asp Arg Phe Gly Ala Pro Thr Tyr Asn Trp Gly
180 185 190

AAC AAT GAG ACG GAT GTG CTA CTC CTC AAC AAC ACA CGG CCG CCG CAG 625
Asn Asn Glu Thr Asp Val Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln
195 200 205

GGC AAC TGG TTC GGT TGT ACC TGG ATG 652 
Gly Asn Trp Phe Gly Cys Thr Trp Met 
210 215 
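
The running position printed at the end of each codon row follows from the row width of 16 codons (48 bases) and from any unpaired bases before the first codon. The sketch below illustrates that arithmetic only; the helper names row_end_positions and codon_count are hypothetical and do not come from the listing.

```python
def row_end_positions(length, leading=0, codons_per_row=16):
    """Return the base position printed at the end of each row.

    `leading` is the number of bases printed before the first full codon
    (1 for clone N27N19-1, whose listing opens with a lone "C").
    """
    step = 3 * codons_per_row
    ends = list(range(leading + step, length, step))
    if not ends or ends[-1] != length:
        ends.append(length)  # the final, possibly shorter, row
    return ends


def codon_count(length, leading=0):
    """Return (complete codons, bases left over after the last complete codon)."""
    return divmod(length - leading, 3)


# Clone N27N19-1: 652 base pairs, one leading base.
print(row_end_positions(652, leading=1))
print(codon_count(652, leading=1))

# The MX24 clones: 629 base pairs, no leading base, ending in a partial "AA".
print(codon_count(629))
```

Under these assumptions the 652-base-pair listing holds 217 complete codons, and the 629-base-pair listings hold 209 complete codons followed by two unpaired bases.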

SEQ ID NO: 29 

SEQUENCE LENGTH: 977 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N19MX24A-1

GAG GCC GTG AAC TGC GAT GAC TCC CTT AAC ACC GGG TTC CTT GCC GCG 48
Glu Ala Val Asn Cys Asp Asp Ser Leu Asn Thr Gly Phe Leu Ala Ala
1 5 10 15

CTG TTC TAC ACG CAC AGG TTC AAC GCG TCC GGA TGT CCG GAG CGT ATG 96
Leu Phe Tyr Thr His Arg Phe Asn Ala Ser Gly Cys Pro Glu Arg Met
20 25 30

GCC GGT TGC CGC CCC ATT GAC GAG TTC GCT CAG GGG TGG GGT CCC ATC 144
Ala Gly Cys Arg Pro Ile Asp Glu Phe Ala Gln Gly Trp Gly Pro Ile
35 40 45

ACT CAT GTT GTG CCT AAC ATC TCG GAC CAG AGG CCC TAT TGC TGG CAC 192
Thr His Val Val Pro Asn Ile Ser Asp Gln Arg Pro Tyr Cys Trp His
50 55 60

TAC GCG CCT CGA CCG TGT GGT ATC GTA CCC GCG TCG CAG GTG TGT GGT 240
Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly
65 70 75 80

CCG GTG TAT TGC TTC ACC CCA AGC CCT GTT GTG GTG GGG ACG ACC GAT 288
Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp
85 90 95

CGT TTC GGC GCC CCC ACG TAC AAC TGG GGA AAC AAT GAG ACG GAT GTG 336
Arg Phe Gly Ala Pro Thr Tyr Asn Trp Gly Asn Asn Glu Thr Asp Val
100 105 110

CTA CTC CTC AAC AAC ACA CGG CCG CCG CAG GGC AAC TGG TTC GGT TGT 384
Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys
115 120 125

ACC TGG ATG AAT GGC ACT GGG TTC ACA AAG ACG TGC GGG GGC CCC CCG 432
Thr Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro
130 135 140

TGC AAC ATC GGG GGG GTC GGC AAC AAT ACC TTG ACT TGC CCC ACG GAC 480
Cys Asn Ile Gly Gly Val Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp
145 150 155 160

TGC TTC CGG AAG CAC CCC GAG GCC ACT TAC ACA AAA TGT GGT TCG GGG 528
Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Thr Lys Cys Gly Ser Gly
165 170 175

CCT TGG TTG ACG CCT AGG TGC CTA GTT CAT TAC CCA TAC AGG CTC TGG 576
Pro Trp Leu Thr Pro Arg Cys Leu Val His Tyr Pro Tyr Arg Leu Trp
180 185 190

CAC TAT CCC TGC ACT GTC AAC TTT ACC ATC TTC AAG GTT AGG ATG TAT 624
His Tyr Pro Cys Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr
195 200 205

GTG GGG GGC GTG GAA CAC AGG CTT GAA GCT GCA TGC AAT TGG ACC CGA 672
Val Gly Gly Val Glu His Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg
210 215 220

GGA GAG CGT TGT GAC TTG GAG GAC AGG GAT AGA TCA GAG CTT AGC CCG 720
Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro
225 230 235 240

CTA TTG CTG TCC ACA ACA GAG TGG CAG GTA CTG CCC TGT TCC TTC ACC 768
Leu Leu Leu Ser Thr Thr Glu Trp Gln Val Leu Pro Cys Ser Phe Thr
245 250 255

ACC CTG CCG GCT CTG TCC ACT GGT TTG ATT CAT CTC CAT CAG AAC ATC 816
Thr Leu Pro Ala Leu Ser Thr Gly Leu Ile His Leu His Gln Asn Ile
260 265 270

GTG GAC GTG CAA TAT CTG TAC GGC ATA GGG TCG GCG GTT GTC TCC TTC 864
Val Asp Val Gln Tyr Leu Tyr Gly Ile Gly Ser Ala Val Val Ser Phe
275 280 285

GCA ATC AAA TGG GAA TAT ATT CTG TTG CTT TTC CTC CTC CTG GCG GAC 912
Ala Ile Lys Trp Glu Tyr Ile Leu Leu Leu Phe Leu Leu Leu Ala Asp
290 295 300

GCG CGC GTC TGT GCC TGC TTG TGG ATG ATG CTG CTG ATA GCC CAC GCC 960
Ala Arg Val Cys Ala Cys Leu Trp Met Met Leu Leu Ile Ala His Ala
305 310 315 320

GAC GCC ACC TTA GAG AA 977
Asp Ala Thr Leu Glu
325



SEQ ID NO: 30 

SEQUENCE LENGTH: 977 base pairs
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N19MX24B-1 



GAG GCC GTG AAC TGC GAT GAC TCC CTT AAC ACC GGG TTC CTT GCC GCG 48
Glu Ala Val Asn Cys Asp Asp Ser Leu Asn Thr Gly Phe Leu Ala Ala
1 5 10 15

CTG TTC TAC ACG CAC AGG TTC AAC GCG TCC GGA TGT CCG GAG CGT ATG 96
Leu Phe Tyr Thr His Arg Phe Asn Ala Ser Gly Cys Pro Glu Arg Met
20 25 30

GCC GGT TGC CGC CCC ATT GAC GAG TTC GCT CAG GGG TGG GGT CCC ATC 144
Ala Gly Cys Arg Pro Ile Asp Glu Phe Ala Gln Gly Trp Gly Pro Ile
35 40 45

ACT CAT GTT GTG CCT AAC ATC TCG GAC CAG AGG CCC TAT TGC TGG CAC 192
Thr His Val Val Pro Asn Ile Ser Asp Gln Arg Pro Tyr Cys Trp His
50 55 60

TAC GCG CCT CGA CCG TGT GGT ATC GTA CCC GCG TCG CAG GTG TGT GGT 240
Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro Ala Ser Gln Val Cys Gly
65 70 75 80

CCG GTG TAT TGC TTC ACC CCA AGC CCT GTT GTG GTG GGG ACG ACC GAT 288
Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp
85 90 95

CGT TTC GGC GCC CCC ACG TAC AAC TGG GGA AAC AAT GAG ACG GAT GTG 336
Arg Phe Gly Ala Pro Thr Tyr Asn Trp Gly Asn Asn Glu Thr Asp Val
100 105 110

CTA CTC CTC AAC AAC ACA CGG CCG CCG CAG GGG AAC TGG TTT GGC TGT 384
Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys
115 120 125

ACA TGG ATG AAT GGC ACT GGG TTC ACA AAG ACG TGC GGG GGC CCC CCG 432
Thr Trp Met Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro
130 135 140

TGC AAC ATC GGG GGG GTC GGC AAC AAT ACC TTG ACT TGC CCC ACG GAC 480
Cys Asn Ile Gly Gly Val Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp
145 150 155 160

TGC TTC CGG AAG CAC CCC GAG GCC ACT TAC ACA AAA TGT GGT TCG GGG 528
Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Thr Lys Cys Gly Ser Gly
165 170 175

CCT TGG TTG ACG CCT AGG TGC CTA GTT CAT TAC CCA TAC AGG CTC TGG 576
Pro Trp Leu Thr Pro Arg Cys Leu Val His Tyr Pro Tyr Arg Leu Trp
180 185 190

CAC TAT CCC TGC ACT GTC AAC TTT ACC ATC TTC AAG GTT AGG ATG TAT 624
His Tyr Pro Cys Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr
195 200 205

GTG GGG GGC GTG GAA CAC AGG CTT GAA GCT GCA TGC AAT TGG ACC CGA 672
Val Gly Gly Val Glu His Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg
210 215 220

GGA GAG CGT TGT GAC TTG GAG GAC AGG GAT AGA TCA GAG CTT AGC CCG 720
Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro
225 230 235 240

CTA TTG CTG TCC ACA ACA GAG TGG CAG GTA CTG CCC TGT TCC TTC ACC 768
Leu Leu Leu Ser Thr Thr Glu Trp Gln Val Leu Pro Cys Ser Phe Thr
245 250 255

ACC CTG CCG GCT CTG TCC ACT GGT TTG ATT CAT CTC CAT CAG AAC ATC 816
Thr Leu Pro Ala Leu Ser Thr Gly Leu Ile His Leu His Gln Asn Ile
260 265 270

GTG GAC GTG CAA TAT CTG TAC GGC ATA GGG TCG GCG GTT GTC TCC TTC 864
Val Asp Val Gln Tyr Leu Tyr Gly Ile Gly Ser Ala Val Val Ser Phe
275 280 285

GCA ATC AAA TGG GAA TAT ATT CTG TTG CTT TTC CTC CTC CTG GCG GAC 912
Ala Ile Lys Trp Glu Tyr Ile Leu Leu Leu Phe Leu Leu Leu Ala Asp
290 295 300

GCG CGC GTC TGT GCC TGC TTG TGG ATG ATG CTG CTG ATA GCC CAC GCC 960
Ala Arg Val Cys Ala Cys Leu Trp Met Met Leu Leu Ile Ala His Ala
305 310 315 320

GAC GCC ACC TTA GAG AA 977
Asp Ala Thr Leu Glu
325



SEQ ID NO: 31 

SEQUENCE LENGTH: 1236 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N27MX24A-1

C CGG ATC CCA CAA GCC GTG GTG GAT ATG GTG GCA GGG GCC CAC TGG GGA 49
Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly
1 5 10 15

GTC CTG GCG GGC CTT GCC TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG 97
Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys
20 25 30

GTC TTG GTT GTG ATG CTG CTC TTC GCC GGT GTT GAC GGG GGG ACC CAC 145
Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly Gly Thr His
35 40 45

GTG ACA GGG GGG AAG GTA GCC TAC ACC ACC CAG GGC TTT ACA CCC TTC 193
Val Thr Gly Gly Lys Val Ala Tyr Thr Thr Gln Gly Phe Thr Pro Phe
50 55 60

TTT TCA CGA GGG CCG TCT CAG AAA ATC CAA CTT GTA AAC ACT AAC GGC 241
Phe Ser Arg Gly Pro Ser Gln Lys Ile Gln Leu Val Asn Thr Asn Gly
65 70 75 80

AGC TGG CAC ATC AAT AGG ACT GCC CTC AAT TGC AAT GAC TCC CTT AAC 289
Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Asn
85 90 95

ACC GGG TTC CTT GCC GCG CTG TTC TAC ACC CAC AGC TTC AAC GCG TCC 337
Thr Gly Phe Leu Ala Ala Leu Phe Tyr Thr His Ser Phe Asn Ala Ser
100 105 110

GGA TGT CCG GAG CGT ATG GCC GGT TGC CGC CCC ATT GAC GAG TTC GCT 385
Gly Cys Pro Glu Arg Met Ala Gly Cys Arg Pro Ile Asp Glu Phe Ala
115 120 125

CAG GGG TGG GGT CCC ATC ACT CAT GTT GTG CCT AAC ATC TCG GAC CAG 433
Gln Gly Trp Gly Pro Ile Thr His Val Val Pro Asn Ile Ser Asp Gln
130 135 140

AGG CCC TAT TGC TGG CAC TAC GCG CCT CGA CCG TGT GGT ATC GTA CCC 481
Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro
145 150 155 160

GCG TCG CAG GTG TGT GGT CCG GTG TAT TGC TTC ACC CCA AGC CCT GTT 529
Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val
165 170 175

GTG GTG GGG ACG ACC GAT CGT TTC GGC GCC CCC ACG TAC AAC TGG GGA 577
Val Val Gly Thr Thr Asp Arg Phe Gly Ala Pro Thr Tyr Asn Trp Gly
180 185 190

AAC AAT GAG ACG GAT GTG CTA CTC CTC AAC AAC ACA CGG CCG CCG CAG 625
Asn Asn Glu Thr Asp Val Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln
195 200 205

GGC AAC TGG TTC GGT TGT ACC TGG ATG AAT GGC ACT GGG TTC ACA AAG 673
Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe Thr Lys
210 215 220

ACG TGC GGG GGC CCC CCG TGC AAC ATC GGG GGG GTC GGC AAC AAT ACC 721
Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Val Gly Asn Asn Thr
225 230 235 240

TTG ACT TGC CCC ACG GAC TGC TTC CGG AAG CAC CCC GAG GCC ACT TAC 769
Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr
245 250 255

ACA AAA TGT GGT TCG GGG CCT TGG TTG ACG CCT AGG TGC CTA GTT CAT 817
Thr Lys Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Leu Val His
260 265 270

TAC CCA TAC AGG CTC TGG CAC TAT CCC TGC ACT GTC AAC TTT ACC ATC 865
Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile
275 280 285

TTC AAG GTT AGG ATG TAT GTG GGG GGC GTG GAA CAC AGG CTT GAA GCT 913

Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Leu Glu Ala
290 295 300

GCA TGC AAT TGG ACC CGA GGA GAG CGT TGT GAC TTG GAG GAC AGG GAT 961
Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp
305 310 315 320

AGA TCA GAG CTT AGC CCG CTA TTG CTG TCC ACA ACA GAG TGG CAG GTA 1009
Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu Trp Gln Val
325 330 335

CTG CCC TGT TCC TTC ACC ACC CTG CCG GCT CTG TCC ACT GGT TTG ATT 1057
Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly Leu Ile
340 345 350

CAT CTC CAT CAG AAC ATC GTG GAC GTG CAA TAT CTG TAC GGC ATA GGG 1105
His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly Ile Gly
355 360 365

TCG GCG GTT GTC TCC TTC GCA ATC AAA TGG GAA TAT ATT CTG TTG CTT 1153
Ser Ala Val Val Ser Phe Ala Ile Lys Trp Glu Tyr Ile Leu Leu Leu
370 375 380

TTC CTC CTC CTG GCG GAC GCG CGC GTC TGT GCC TGC TTG TGG ATG ATG 1201
Phe Leu Leu Leu Ala Asp Ala Arg Val Cys Ala Cys Leu Trp Met Met
385 390 395 400

CTG CTG ATA GCC CAC GCC GAC GCC ACC TTA GAG AA 1236
Leu Leu Ile Ala His Ala Asp Ala Thr Leu Glu
405 410



SEQ ID NO: 32 

SEQUENCE LENGTH: 1236 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N27MX24B-1 

C CGG ATC CCA CAA GCC GTG GTG GAT ATG GTG GCA GGG GCC CAC TGG GGA 49
Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly
1 5 10 15

GTC CTG GCG GGC CTT GCC TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG 97
Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys
20 25 30

GTC TTG GTT GTG ATG CTG CTC TTC GCC GGT GTT GAC GGG GGG ACC CAC 145
Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly Gly Thr His
35 40 45

GTG ACA GGG GGG AAG GTA GCC TAC ACC ACC CAG GGC TTT ACA CCC TTC 193
Val Thr Gly Gly Lys Val Ala Tyr Thr Thr Gln Gly Phe Thr Pro Phe
50 55 60

TTT TCA CGA GGG CCG TCT CAG AAA ATC CAA CTT GTA AAC ACT AAC GGC 241
Phe Ser Arg Gly Pro Ser Gln Lys Ile Gln Leu Val Asn Thr Asn Gly
65 70 75 80

AGC TGG CAC ATC AAT AGG ACT GCC CTC AAT TGC AAT GAC TCC CTT AAC 289
Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser Leu Asn
85 90 95

ACC GGG TTC CTT GCC GCG CTG TTC TAC ACC CAC AGC TTC AAC GCG TCC 337
Thr Gly Phe Leu Ala Ala Leu Phe Tyr Thr His Ser Phe Asn Ala Ser
100 105 110

GGA TGT CCG GAG CGT ATG GCC GGT TGC CGC CCC ATT GAC GAG TTC GCT 385
Gly Cys Pro Glu Arg Met Ala Gly Cys Arg Pro Ile Asp Glu Phe Ala
115 120 125

CAG GGG TGG GGT CCC ATC ACT CAT GTT GTG CCT AAC ATC TCG GAC CAG 433
Gln Gly Trp Gly Pro Ile Thr His Val Val Pro Asn Ile Ser Asp Gln
130 135 140

AGG CCC TAT TGC TGG CAC TAC GCG CCT CGA CCG TGT GGT ATC GTA CCC 481
Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Ile Val Pro
145 150 155 160

GCG TCG CAG GTG TGT GGT CCG GTG TAT TGC TTC ACC CCA AGC CCT GTT 529
Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val
165 170 175

GTG GTG GGG ACG ACC GAT CGT TTC GGC GCC CCC ACG TAC AAC TGG GGA 577
Val Val Gly Thr Thr Asp Arg Phe Gly Ala Pro Thr Tyr Asn Trp Gly
180 185 190

AAC AAT GAG ACG GAT GTG CTA CTC CTC AAC AAC ACA CGG CCG CCG CAG 625
Asn Asn Glu Thr Asp Val Leu Leu Leu Asn Asn Thr Arg Pro Pro Gln
195 200 205

GGG AAC TGG TTT GGC TGT ACA TGG ATG AAT GGC ACT GGG TTC ACA AAG 673
Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe Thr Lys
210 215 220

ACG TGC GGG GGC CCC CCG TGC AAC ATC GGG GGG GTC GGC AAC AAT ACC 721
Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Val Gly Asn Asn Thr
225 230 235 240

TTG ACT TGC CCC ACG GAC TGC TTC CGG AAG CAC CCC GAG GCC ACT TAC 769
Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr
245 250 255

ACA AAA TGT GGT TCG GGG CCT TGG TTG ACG CCT AGG TGC CTA GTT CAT 817
Thr Lys Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Leu Val His
260 265 270

TAC CCA TAC AGG CTC TGG CAC TAT CCC TGC ACT GTC AAC TTT ACC ATC 865
Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile
275 280 285

TTC AAG GTT AGG ATG TAT GTG GGG GGC GTG GAA CAC AGG CTT GAA GCT 913
Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Leu Glu Ala
290 295 300

GCA TGC AAT TGG ACC CGA GGA GAG CGT TGT GAC TTG GAG GAC AGG GAT 961
Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp
305 310 315 320

AGA TCA GAG CTT AGC CCG CTA TTG CTG TCC ACA ACA GAG TGG CAG GTA 1009
Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu Trp Gln Val
325 330 335

CTG CCC TGT TCC TTC ACC ACC CTG CCG GCT CTG TCC ACT GGT TTG ATT 1057
Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly Leu Ile
340 345 350

CAT CTC CAT CAG AAC ATC GTG GAC GTG CAA TAT CTG TAC GGC ATA GGG 1105
His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly Ile Gly
355 360 365

TCG GCG GTT GTC TCC TTC GCA ATC AAA TGG GAA TAT ATT CTG TTG CTT 1153
Ser Ala Val Val Ser Phe Ala Ile Lys Trp Glu Tyr Ile Leu Leu Leu
370 375 380

TTC CTC CTC CTG GCG GAC GCG CGC GTC TGT GCC TGC TTG TGG ATG ATG 1201
Phe Leu Leu Leu Ala Asp Ala Arg Val Cys Ala Cys Leu Trp Met Met
385 390 395 400

CTG CTG ATA GCC CAC GCC GAC GCC ACC TTA GAG AA 1236
Leu Leu Ile Ala His Ala Asp Ala Thr Leu Glu
405 410

SEQ ID NO: 33 

SEQUENCE LENGTH: 849 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: MX25 



TGT GCC TGG TTG TGG ATG ATG CTG CTG ATA GCC CAA GCT GAG GCC GCC 48
Cys Ala Trp Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala
1 5 10 15

TTG GAG AAC CTG GTG GTC CTC AAT GCA GCA TCC ATG GCS GGA GCG CAT 96
Leu Glu Asn Leu Val Val Leu Asn Ala Ala Ser Met Ala Gly Ala His
20 25 30

GGC ATC CTC TCT TTC CTT GTG TTC TTC TGT GCC GCC TGG TAC ATC AAA 144
Gly Ile Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys
35 40 45

GGC AGG CTG GTC CCY GGG GCG RCA TAY GCT YTC TAT GGC GTA TGG CCG 192
Gly Arg Leu Val Pro Gly Ala Xaa Tyr Ala Xab Tyr Gly Val Trp Pro
50 55 60

CTG CTC CTG CTC TTG MTG GCG CTA CCS SCA CGG GCG TAC GCC ATG GAC 240
Leu Leu Leu Leu Leu Xac Ala Leu Pro Xad Arg Ala Tyr Ala Met Asp
65 70 75 80

CGG GAS ATG GCT GCA TCG TGC GGA GGC GCG GTT TTT GTA GGT CTG GTA 288
Arg Xae Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly Leu Val
85 90 95

CTC YTG ACC TTG TCA CCA TAC TAC AAA GTG TTC CTC GCT ARG CTC ATA 336
Leu Leu Thr Leu Ser Pro Tyr Tyr Lys Val Phe Leu Ala Xaf Leu Ile
100 105 110

TGG TGG TTR CAA TAT CTC ATC ACC AGR GCC GAG GCG CAC YTG CAA GTG 384
Trp Trp Leu Gln Tyr Leu Ile Thr Arg Ala Glu Ala His Leu Gln Val
115

TGG ATY CCC CCY CTY 
Trp lie Pro Pro Leu 
130 

CTC ACR TGT GCG GTC 
Leu Tre Cys Ala Val 
145 

YTG CTC GCC ATA CTC 
Leu Leu Ala lie Leu 

165 

MRA RTG CCG TAC TTY 
Xai Xaj Pro Tyr Phe 

180 

TTR GTG CGG AAA GYC 
Leu Val Arg Lys Xal 
195 

AAG CTG GCY GCR CTG 
Lys Leu Ala Ala Leu 
210 

CTG CAG SAY TGG GCC 
Leu Gin Xar Trp Ala 
225 

GAG CCC GTT GYC TTC 
Glu Pro Val Xas Phe 

245 

GCA GAC ACY GCG GCG 
Ala Asp Thr Ala Ala 

260 

GCC CGG AGG GGC AAC 
Ala Arg Arg Gly Asn 
275 

Y : C or T R : 

S : G or C W : 



AAC 
Asn 

CAY 
His 
150 
GGT 
Gly 



120 
GTY CGG 
Val Arg 
135 

CCR GAG 
Pro Glu 

CCG CTC 
Pro Leu 



GGR GGC CGC 
Gly Gly Arg 



GTR CGY GCT 
Val Arg Ala 



GCY GGR 
Ala Gly 



ACA 
Thr 

CAY 
His 
230 
TCT 
Ser 



GGT 
Gly 
215 
GCG 
Ala 

GAY 
Asp 



GGT 
Gly 
200 
ACG 
Thr 

GGC 
Gly 

ATG 
Met 



CTR 
Leu 

ATG 
Met 

CAA 
Gin 
185 
CAT 
His 



ATY 
He 

GTR 
Val 
170 
GGG 
Gly 



TTT 
Phe 
155 
CTC 
Leu 

CTC 
Leu 



TAT GTY 

Tyr Val 



TAC RTT 
Tyr Xao 

CTA CGR 
Leu Arg 



Xaa 
Xad 
Xag 
Xaj 



Ala or Thr 
Ala or Pro 
Gly or Ala 
Met or Val 



TGT GGG GAC 
Cys Gly Asp 

GAG ATA CTC 
Glu lie Leu 
280 

A or G 
A or T 

Xab 
Xae 
Xah 
Xak 



GAG 
Glu 

ATC 
He 
265 
CTC 
Leu 



ACY 
Tre 
250 
ATT 
He 

GGA 
Gly 



TAT 
Tyr 

GAC 
Asp 
235 
AAG 
Lys 



GAY 
Asp 
140 
GAC 
Asp 

CAG 
Gin 

ATY 
He 

CAR 
Gin 

GWC 
Xap 
220 
CTT 
Leu 

ATC 
lie 



125 

GCC ATC ATC CTY 
Ala He He Leu 



432 



ATC 
He 

GCT 
Ala 

CGT 
Arg 

ATG 
Met 
205 
CAT 
His 



ACC 
Thr 

GSC 
Xag 

RYG 
Xak 
190 
GCY 
Ala 



AAR 
Lys 

MTA 
Xah 
175 
TGC 
Cys 



CTY 

Leu 
160 
ACY 
Thr 

ATG 
Met 



YTY RTG 
Xam Xan 



CTT RCY CCA 
Leu Xag Pro 



TTG GGC 
Leu Gly 

CCG 
Pro 



GCG GTR GCR 
Ala Val Ala 

ATC ACC TGG 
He Thr Trp 
255 

CTA CCW GTC 
Leu Pro Val 
270 



GTW 
Val 
240 
GGG 
Gly 

TCC 
Ser 



M : A or C 
H ; A or C or 

Phe or Leu 

Glu or Asp 

Leu or He 

Met or Ala 



K 
B 
Xac 
Xaf 
Xai 
Xal 



G or T 
G or T 
Met or 
Lys or 
Gin or 
Ala or 



480 



528 



576 



624 



672 



720 



768 



816 



849 



or C 

Leu 

Arg 

Arg 

Val 
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Xam : Leu or Phe 
Xap : Asp or Val 
Xas : Ala or Val 

SEQ ID NO: 34 
SEQUENCE LENGTH: 524 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: 026 

ATC ACG TGG GGG GCA GAG ACG GCG GCG TGT GGG GAC ATC ATC TCG GGT 48 
He Thr Trp Gly Ala Glu Thr Ala Ala Cys Gly Asp He He Ser Gly 

15 10 15 

CTA CCC GTT TCC GCC CGA AGG GGG ARG GAG CTG CTT TTG GGR CCG GCC 96 
Leu Pro Val Ser Ala Arg Arg Gly Xaa Glu Leu Leu Leu Gly Pro Ala 

20 25 30 

GAT AGT TTT GAC GGG CAG GGG TGG CGA CTC CTT GCG CCT ATC ACG GCC 144 
Asp Ser Phe Asp Gly Gin Gly Trp Arg Leu Leu Ala Pro He Thr Ala 

35 40 45 

TAG TCC CAG CAR ACG CGG GGC CTG CTT GGT TGC ATC ATC ACY AGC CTT 192 
Tyr Ser Gin Gin Thr Arg Gly Leu Leu Gly Cys He He Tre Ser Leu 

50 55 60 

ACG GGC CGG GAT AAR AAC CAG GTC GAG GGG GAG GTT CAA GTG GTC TCT 240 
Thr Gly Arg Asp Lys Asn Gin Val Glu Gly Glu Val Gin Val Val Ser 
65 70 75 80 

ACC GCA ACA CAA TCT TTC CTG GCG ACC TGY RTC AAC GGC GTK TGC TGG 288 
Thr Ala Thr Gin Ser Phe Leu Ala Thr Cys Xab Asn Gly Val Cys Trp 

85 90 95 

ACT GTT TTC CAC GGY GCC GGC TCG AAG ACC TTA GCC GGC CCA AAA GGC 336 
Thr Val Phe His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly 

100 105 110 

CCA ATC ACC CAA ATG TAC ACC AAT GTR GAT CAG GAC CTC GTC GGY TGG 384 



Xan : Met or Val 
Xaq : Thr or Ala 



Xao : Val or He 
Xar : Asp or His 
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Pro He Thr Gin Met Tyr Thr Asn Val Asp Gin Asp Leu Val Gly Trp 

115 120 125 

TCG GCG CCC CCC SGG GCG CGT TCC TTG ACA CCW TGC ACC TGC GGC AGC 432 
Ser Ala Pro Pro Xac Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser 

130 135 140 

TCG GAC CTT TAT TTG GTC ACG AGR CAT GCT GAT GTC ATT CCG GTG CAC 480 
Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val He Pro Val His 
145 150 155 160 

CGG CGG GGC GAC AGC AGG GGG AGC CTC CTC TCC CCC GGG CCC AT 524 
Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Gly Pro 

165 170 
Y:CorT R : A or G M i A or C K : G or T 

S : G or C W : A or T H : A or C or T B : G or T or C 

Xaa : Arg or Lys Xab : Val or He Xac : Gly or Arg 
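
The ambiguity codes in the legend above can be checked mechanically against the Xaa entries. The following is a minimal sketch, not part of the original listing: it assumes the standard genetic code (the listing does not state which table was used), reports residues in one-letter code, and the helper names `expand_codon` and `possible_residues` are hypothetical.

```python
from itertools import product

# Nucleotide ambiguity codes exactly as given in the legend above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "R": "AG", "M": "AC", "K": "GT",
    "S": "GC", "W": "AT", "H": "ACT", "B": "GTC",
}

# Standard genetic code (an assumption), built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(codon):
    """Return every unambiguous codon covered by a degenerate codon."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in codon))]


def possible_residues(codon):
    """Return the sorted one-letter residues a degenerate codon can encode."""
    return sorted({CODON_TABLE[c] for c in expand_codon(codon)})


if __name__ == "__main__":
    # Degenerate codons taken from SEQ ID NO: 34 (Xaa, Xab, Xac).
    for codon in ("ARG", "RTC", "SGG"):
        print(codon, expand_codon(codon), possible_residues(codon))
```

Under these assumptions, `ARG` expands to AAG and AGG (Lys or Arg), `RTC` to ATC and GTC (Ile or Val), and `SGG` to GGG and CGG (Gly or Arg), in agreement with the Xaa, Xab and Xac entries of SEQ ID NO: 34.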

SEQ ID NO: 35 

SEQUENCE LENGTH: 921 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N23 



CTG CTG TCG CCC GGG CCC ATC TCY TAC YTG AAG GGY TCC TCG GGT GGT 48
Leu Leu Ser Pro Gly Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly
1 5 10 15

CCG CTG CYT TGC CCC TCG GGC CRT GTT GTG GGC ATC TTC CGG GCT GCY 96
Pro Leu Xaa Cys Pro Ser Gly Xab Val Val Gly Ile Phe Arg Ala Ala
20 25 30

GTG TGC ACC CGG GGG GTT GCG AAG GCG GTR GAC TTT GTG CCC GTT GAG 144
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
35 40 45

TCT ATG GAA ACC ACY ATG CGG TCT CCG GTC TTC RCG GAT AAC TCA ACC 192
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Xac Asp Asn Ser Thr
50 55 60

CCC CCG GCC GTA CCG CAG WCA TTC CAA GTG GCC CAC CTA CAC GCT CCC 240
Pro Pro Ala Val Pro Gln Xad Phe Gln Val Ala His Leu His Ala Pro
65 70 75 80

ACT GGC AGC GGC AAA AGC ACC ARG GTG CCG GCT GCG TAT GCG GCC CAA 288
Thr Gly Ser Gly Lys Ser Thr Xae Val Pro Ala Ala Tyr Ala Ala Gln
85 90 95

GGG TAC AAG GTA CTC GTC CTG AAC CCG TCC GTT GCT GCC ACT TTG GGC 336
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly
100 105 110

TTT GGG GCG TAY ATG TCC AAG GCA CAT GGT GTT GAC CCT AAC ATC AGA 384
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg
115 120 125

ACT GGG GTG AGG ACC ATC ACC ACG GGC GCT CCC RTC ACG TAC TCC ACC 432
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Xaf Thr Tyr Ser Thr
130 135 140

TAC GGT AAG TTC CTC GCC GAC GGT GGC TGT TCT GGG GGT GCC TAT GAC 480
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp
145 150 155 160

ATC ATA ATA TGT GAT GAG TGT CAT TCA ACT GAC TCG ACT TCC ATC TTG 528
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu
165 170 175

GGC ATT GGT ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGC CTT 576
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu
180 185 190

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCG CAT 624
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His
195 200 205

CCT AAT ATT GAG GAG GTG GCC TTG TCC AAC ACT GGA GAG ATC CCC TTC 672
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe
210 215 220

TAT GGC AAG GCC ATC CCC CTC GAG GCC ATC AAG GGG GGG AGG CAT CTC 720
Tyr Gly Lys Ala Ile Pro Leu Glu Ala Ile Lys Gly Gly Arg His Leu
225 230 235 240

AYT TTC TGC CAT TCC AAG AAG AAA TGT GAC GAG CTC GCT GCG AAG CTG 768
Xag Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu
245 250 255

TCG GCC CTC GGA GTC AAY GCT GTA GCA TAY TAC CGG GGT CTT GAT GTG 816
Ser Ala Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val
260 265 270

TCC RTC ATA CCG ACA AGC GGG GAC GTC GTT GTC GTG GCA ACW GAC GCT 864
Ser Xah Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala
275 280 285

CTA ATG ACG GGC TAT ACC GGT GAC TTT GAC TCR GTG ATC GAC TGY AAC 912
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn
290 295 300

ACA TGT GTC 921
Thr Cys Val
305

Y : C or T    R : A or G    M : A or C    K : G or T
S : G or C    W : A or T    H : A or C or T    B : G or T or C

Xaa : Leu or Pro    Xab : His or Arg    Xac : Thr or Ala
Xad : Ser or Thr    Xae : Lys or Arg    Xaf : Ile or Val
Xag : Thr or Ile    Xah : Val or Ile

SEQ ID NO: 36 

SEQUENCE LENGTH: 623 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N16

GGC TAT ACC GGC GAC TTC GAC TCA GTG ATC GAC TGC AAC ACA TGT GTC 48
Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val
1 5 10 15

ACC CAA ACA GTC GAT TTC AGC TTG GAC CCT ACT TTC ACC ATY GAG ACG 96
Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr
20 25 30

ACG ACC GTA CCC CAA GAT GCG GTG TCG CGC TCG CAG CGG CGA GGC AGG 144
Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg
35 40 45

ACT GGT AGG GGC AGR GGG GGC ATA TAC AGG TTT GTA ACT CCA GGG GAA 192
Thr Gly Arg Gly Arg Gly Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu
50 55 60

CGG CCC TCA GGC ATG TTC GAT TCT TCG GTC CTG TGT GAA TGT TAT GAC 240
Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp
65 70 75 80

GCG GGC TGT GCT TGG TAC GAG CTC ACG YCC GCC GAG ACC TCG GTT AGG 288
Ala Gly Cys Ala Trp Tyr Glu Leu Thr Xaa Ala Glu Thr Ser Val Arg
85 90 95

TTG CGG GCT TAC CTA AAY ACA CCT GGG CTG CCC GTC TGC CAG GAC CAT 336
Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His
100 105 110

CTG GAG TTC TGG GAG AGC GTC TTC ACC GGC CTC ACC CAC ATA GAT GCC 384
Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala
115 120 125

CAC TTC TTG TCC CAG ACY AAA CAG GCA GGA GAC AAC TTC CCC TAC CTG 432
His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu
130 135 140

GTA GCA TAC CAG GCT ACA GTG TGC GCC AGG GCC AAG GCT CCA CCT CCA 480
Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro Pro
145 150 155 160

TCG TGG GAT CAG ATG TGG AAG TGT CTC ATA CGG CTG AAG CCT ACG CTA 528
Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu
165 170 175

CAC GGG CCA ACG CCC CTG TTG YAT AGG TTA GGA GCC GTT CAG AAC RAG 576
His Gly Pro Thr Pro Leu Leu Xab Arg Leu Gly Ala Val Gln Asn Xac
180 185 190

GTT RCC CTY ACA CAC CCY ATA ACC AAG TAC ATC ATG ACA TGC ATG TC 623
Val Xad Leu Thr His Pro Ile Thr Lys Tyr Ile Met Thr Cys Met
195 200 205

Y : C or T    R : A or G    M : A or C    K : G or T
S : G or C    W : A or T    H : A or C or T    B : G or T or C

Xaa : Pro or Ser    Xab : Tyr or His    Xac : Glu or Lys
Xad : Thr or Ala



SEQ ID NO: 37 

SEQUENCE LENGTH: 623 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: U16-4

GGC TAT ACC GGC GAC TTC GAC TCG GTG ATC GAC TGT AAT ACA TGT GTC 48
Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val
1 5 10 15

ATC CAG ACA GTC GAC TTC AGC TTG GAC CCC ACC TTC ACC ATC GAG ACG 96
Ile Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr
20 25 30

ACT ACC GTG CCC CAA GAC GCG GTG TCA CGC TCG CAA CGG CGA GGC AGG 144
Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg
35 40 45

ACT GGC AGG GGC AGG CAA GGC ATT TAC AGG TTT GTG ACT CCA GGA GAA 192
Thr Gly Arg Gly Arg Gln Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu
50 55 60

CGG CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGC GAG TGC TAT GAC 240
Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp
65 70 75 80

GCG GGC TGT GCT TGG TAC GAG CTC CCG CCC GCC GAG ACC ACG GTC AGG 288
Ala Gly Cys Ala Trp Tyr Glu Leu Pro Pro Ala Glu Thr Thr Val Arg
85 90 95

TTG CGG GCT TAC CTG AAC ACC CCA GGG CTG CCC GTC TGC CAG GAC CAT 336
Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His
100 105 110

CTG GAG TTC TGG GAG AGC GTC TTC ACA GGC CTC ACC CAC ATA GAT GCC 384
Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala
115 120 125

CAC TTC TTG TCC CAG ACC AAG CAA GCA GGA GAC AAT CTC CCT TAC CTG 432
His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Leu Pro Tyr Leu
130 135 140

GTA GCG TAC CAA GCA ACA GTG TGC GCT AGA GCT CAG GCT CCA CCT CCA 480
Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro
145 150 155 160

TCA TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTA AAA CCT ACA CTA 528
Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu
165 170 175

CGC GGG CCA ACG CCC CTG CTG TAT AGG CTG GGA GCC GTC CAA AAT GAG 576
Arg Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu
180 185 190

GTC AAC CTC ACG CAC CCC GTA ACC AAA TAC ATC ATG ACA TGC ATG TC 623
Val Asn Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met
195 200 205



SEQ ID NO: 38 

SEQUENCE LENGTH: 618 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N13-1

GCGGATCC GGC CTC ACC CAC ATA GAT GCC CAC TTC CTG TCC CAG ACC AAA 50
Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys
1 5 10

CAG GCA GGA GAC AAC TTC CCC TAC CTG GTA GCA TAC CAG GCT ACA GTG 98
Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val
15 20 25 30

TGC GCC AGG GCC AAG GCT CCA CCT CCA TCG TGG GAT CAG ATG TGG AAG 146
Cys Ala Arg Ala Lys Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys
35 40 45

TGT CTC ATA CGG CTG AAG CCT ACG CTA CAC GGG CCA ACG CCC CTG TTG 194
Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu
50 55 60

TAT AGG TTA GGA GCC GTT CAG AAC GAG GTT ACC CTC ACA CAC CCC ATA 242
Tyr Arg Leu Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile
65 70 75

ACC AAG TTC ATC ATG GCA TGC ATG TCG GCT GAC CTA GAG GTC GTC ACT 290
Thr Lys Phe Ile Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr
80 85 90

AGC ACT TGG GTG CTG GTA GGC GGG GTC CTC GCG GCT CTG GCC GCG TAC 338
Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr
95 100 105 110

TGC CTG ACA ACG GGC AGC GTG GTC ATT GTG GGC AGG ATC GTC TTG TCC 386
Cys Leu Thr Thr Gly Ser Val Val Ile Val Gly Arg Ile Val Leu Ser
115 120 125

GGG AGG CCG GTT GTT ATT CCC GAC AGG GAA GTT CTC TAC CAA GAG TTC 434
Gly Arg Pro Val Val Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe
130 135 140

GAT GAA ATG GAA GAG TGC GCC TCG CAC CTC CCT TAC ATC GAA CAA GGA 482
Asp Glu Met Glu Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly
145 150 155

ATG CAG CTC GCC GAG CAA TTC AAG CAG AAG GCG CTC GGT TTG CTG CAA 530
Met Gln Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln
160 165 170

ACA GCC ACC AAG CAA GCG GAG GCT GCT GCT CCC GTG GTG GAG TCC AAG 578
Thr Ala Thr Lys Gln Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys
175 180 185 190

TGG CGA GCC CTT GAG ACC TTC TGG GCG AAG CA GGATCCGC 618
Trp Arg Ala Leu Glu Thr Phe Trp Ala Lys
195 200



SEQ ID NO: 39 

SEQUENCE LENGTH: 969 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N15-1

GCGGATCCT CCA CCT CCA TCG TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG 51
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg
1 5 10

CTG AAG CCT ACG CTA CAC GGG CCA ACG CCC CTG TTG TAT AGG TTA GGA 99
Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly
15 20 25 30

GCC GTT CAG AAC GAG GTT ACC CTC ACA CAC CCC ATA ACC AAG TTC ATC 147
Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Phe Ile
35 40 45

ATG GCA TGC ATG TCG GCT GAC CTA GAG GTC GTC ACT AGC ACT TGG GTG 195
Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val
50 55 60

CTG GTA GGC GGG GTC CTC GCG GCT CTG GCC GCG TAC TGC CTG ACA ACG 243
Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr
65 70 75

GGC AGC GTG GTC ATT GTG GGC AGG ATC ATC TTG TCC GGG AGG CCG GCC 291
Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala
80 85 90

GTT ATT CCC GAC AGG GAA GTT CTC TAC CAA GAG TTC GAT GAA ATG GAA 339
Val Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu
95 100 105 110

GAG TGC GCC TCG CAC CTC CCT TAC ATC GAA CAA GGA ATG CAG CTC GCC 387
Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala
115 120 125

GAG CAA TTC AAG CAG AAG GCG CTC GGT TTG CTG CAA ACA GCC ACC AAG 435
Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys
130 135 140

CAA GCG GAG GCT GCT GCT CCC GTG GTG GAG TCC AAG TGG CGA GCC CTT 483
Gln Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu
145 150 155

GAG ACC TTC TGG GCG AAG CAC ATG TGG AAT TTC ATC AGC GGG ATA CAG 531
Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln
160 165 170

TAC TTA GCA GGC TTG TCC ACT CTG CCT GGA AAC CCC GCA ATA GCA TCA 579
Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser
175 180 185 190

CTG ATG GCA TTC ACA GCC TCT ATC ACC AGC CCG CTC ACC ACC CAA TAT 627
Leu Met Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Tyr
195 200 205

ACC CTC CTG TTT AAC ATC TTG GGG GGA TGG GTG GCC GCC CAA CTC GCC 675
Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala
210 215 220

CCC CCC AGT GCC GCT TCA GCC TTC GTG GGC GCC GGT ATA GCT GGC GCG 723
Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala
225 230 235

GCT GTT GGC AGC ATA GGC CTC GGG AAG GTG CTT GTG GAC ATT CTG GCG 771
Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala
240 245 250

GGT TAT GGA GCA GGG GTG GCA GGC GCG CTC GTG GCC TTT AAG GTC ATG 819
Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met
255 260 265 270

AGC GGT GAC ATG CCC TCC ACC GAG GAC CTG GTC AAC TTA CTC CCC GCC 867
Ser Gly Asp Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala
275 280 285

ATC CTC TCT CCT GGT GCC CTG GTC GTC GGG GTC GTG TGC GCA GCA ATA 915
Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile
290 295 300

CTG CGT CGG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC 963
Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn
305 310 315

CGG CTG C AGCC 974
Arg Leu
320

SEQ ID NO: 40 

SEQUENCE LENGTH: 1280 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: MX25026

TGT GCC TGG TTG TGG ATG ATG CTG CTG ATA GCC CAA GCT GAG GCC GCC 48
Cys Ala Trp Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala
1 5 10 15

TTG GAG AAC CTG GTG GTC CTC AAT GCA GCA TCC ATG GCS GGA GCG CAT 96
Leu Glu Asn Leu Val Val Leu Asn Ala Ala Ser Met Ala Gly Ala His
20 25 30

GGC ATC CTC TCT TTC CTT GTG TTC TTC TGT GCC GCC TGG TAC ATC AAA 144
Gly Ile Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys
35 40 45

GGC AGG CTG GTC CCY GGG GCG RCA TAY GCT YTC TAT GGC GTA TGG CCG 192
Gly Arg Leu Val Pro Gly Ala Xaa Tyr Ala Xab Tyr Gly Val Trp Pro
50 55 60

CTG CTC CTG CTC TTG MTG GCG CTA CCS SCA CGG GCG TAC GCC ATG GAC 240
Leu Leu Leu Leu Leu Xac Ala Leu Pro Xad Arg Ala Tyr Ala Met Asp
65 70 75 80

CGG GAS ATG GCT GCA TCG TGC GGA GGC GCG GTT TTT GTA GGT CTG GTA 288
Arg Xae Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly Leu Val
85 90 95

CTC YTG ACC TTG TCA CCA TAC TAC AAA GTG TTC CTC GCT ARG CTC ATA 336
Leu Leu Thr Leu Ser Pro Tyr Tyr Lys Val Phe Leu Ala Xaf Leu Ile
100 105 110

TGG TGG TTR CAA TAT CTC ATC ACC AGR GCC GAG GCG CAC YTG CAA GTG 384
Trp Trp Leu Gln Tyr Leu Ile Thr Arg Ala Glu Ala His Leu Gln Val
115 120 125

TGG ATY CCC CCY CTY AAC GTY CGG GGR GGC CGC GAY GCC ATC ATC CTY 432
Trp Ile Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Ile Ile Leu
130 135 140

CTC ACR TGT GCG GTC CAY CCR GAG CTR ATY TTT GAC ATC ACC AAR CTY 480
Leu Thr Cys Ala Val His Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu
145 150 155 160

YTG CTC GCC ATA CTC GGT CCG CTC ATG GTR CTC CAG GCT GSC MTA ACY 528
Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala Xag Xah Thr
165 170 175

MRA RTG CCG TAC TTY GTR CGY GCT CAA GGG CTC ATY CGT RYG TGC ATG 576
Xai Xaj Pro Tyr Phe Val Arg Ala Gln Gly Leu Ile Arg Xak Cys Met
180 185 190

TTR GTG CGG AAA GYC GCY GGR GGT CAT TAT GTY CAR ATG GCY YTY RTG 624
Leu Val Arg Lys Xal Ala Gly Gly His Tyr Val Gln Met Ala Xam Xan
195 200 205

AAG CTG GCY GCR CTG ACA GGT ACG TAC RTT TAT GWC CAT CTT RCY CCA 672
Lys Leu Ala Ala Leu Thr Gly Thr Tyr Xao Tyr Xap His Leu Xaq Pro
210 215 220

CTG CAG SAY TGG GCC CAY GCG GGC CTA CGR GAC CTT GCG GTR GCR GTW 720
Leu Gln Xar Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val Ala Val
225 230 235 240

GAG CCC GTT GYC TTC TCT GAY ATG GAG ACY AAG ATC ATC ACS TGG GGG 768
Glu Pro Val Xas Phe Ser Asp Met Glu Thr Lys Ile Ile Thr Trp Gly
245 250 255

GCA GAS ACB GCG GCG TGT GGG GAC ATC ATY TYG GGY CTA CCH GTY TCC 816
Ala Xat Thr Ala Ala Cys Gly Asp Ile Ile Xau Gly Leu Pro Val Ser
260 265 270

GCC CGR AGG GGY ARS GAG MTR CTY YTS GGR CCG GCC GAT AGT TTT GAC 864
Ala Arg Arg Gly Xav Glu Xaw Leu Xax Gly Pro Ala Asp Ser Phe Asp
275 280 285

GGG CAG GGG TGG CGA CTC CTT GCG CCT ATC ACG GCC TAC TCC CAG CAR 912
Gly Gln Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ser Gln Gln
290 295 300

ACG CGG GGC CTG CTT GGT TGC ATC ATC ACY AGC CTT ACG GGC CGG GAT 960
Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp
305 310 315 320

AAR AAC CAG GTC GAG GGG GAG GTT CAA GTG GTC TCT ACC GCA ACA CAA 1008
Lys Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln
325 330 335

TCT TTC CTG GCG ACC TGY RTC AAC GGC GTK TGC TGG ACT GTT TTC CAC 1056
Ser Phe Leu Ala Thr Cys Xay Asn Gly Val Cys Trp Thr Val Phe His
340 345 350

GGY GCC GGC TCG AAG ACC TTA GCC GGC CCA AAA GGC CCA ATC ACC CAA 1104
Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln
355 360 365

ATG TAC ACC AAT GTR GAT CAG GAC CTC GTC GGY TGG TCG GCG CCC CCC 1152
Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Ser Ala Pro Pro
370 375 380

SGG GCG CGT TCC TTG ACA CCW TGC ACC TGC GGC AGC TCG GAC CTT TAT 1200
Xaz Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr
385 390 395 400

TTG GTC ACG AGR CAT GCT GAT GTC ATT CCG GTG CAC CGG CGG GGC GAC 1248
Leu Val Thr Arg His Ala Asp Val Ile Pro Val His Arg Arg Gly Asp
405 410 415

AGC AGG GGG AGC CTC CTC TCC CCC GGG CCC AT 1280
Ser Arg Gly Ser Leu Leu Ser Pro Gly Pro
420 425

Y : C or T    R : A or G    M : A or C    K : G or T
S : G or C    W : A or T    H : A or C or T    B : G or T or C

Xaa : Ala or Thr    Xab : Phe or Leu    Xac : Met or Leu
Xad : Ala or Pro    Xae : Glu or Asp    Xaf : Lys or Arg
Xag : Gly or Ala    Xah : Leu or Ile    Xai : Gln or Arg
Xaj : Met or Val    Xak : Met or Ala    Xal : Ala or Val
Xam : Leu or Phe    Xan : Met or Val    Xao : Val or Ile
Xap : Asp or Val    Xaq : Thr or Ala    Xar : Asp or His
Xas : Ala or Val    Xat : Asp or Glu    Xau : Leu or Ser
Xav : Asn or Arg or Lys    Xaw : Ile or Leu    Xax : Leu or Phe
Xay : Ile or Val    Xaz : Gly or Arg

SEQ ID NO: 41

SEQUENCE LENGTH: 1431 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N16N15

GGC TAT ACC GGC GAC TTC GAC TCA GTG ATC GAC TGC AAC ACA TGT GTC 48
Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val
1 5 10 15

ACC CAA ACA GTC GAT TTC AGC TTG GAC CCT ACT TTC ACC ATY GAG ACG 96
Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr
20 25 30

ACG ACC GTA CCC CAA GAT GCG GTG TCG CGC TCG CAG CGG CGA GGC AGG 144
Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg
35 40 45

ACT GGT AGG GGC AGR GGG GGC ATA TAC AGG TTT GTA ACT CCA GGG GAA 192
Thr Gly Arg Gly Arg Gly Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu
50 55 60

CGG CCC TCA GGC ATG TTC GAT TCT TCG GTC CTG TGT GAA TGT TAT GAC 240
Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp
65 70 75 80

GCG GGC TGT GCT TGG TAC GAG CTC ACG YCC GCC GAG ACC TCG GTT AGG 288
Ala Gly Cys Ala Trp Tyr Glu Leu Thr Xaa Ala Glu Thr Ser Val Arg
85 90 95

TTG CGG GCT TAC CTA AAY ACA CCT GGG CTG CCC GTC TGC CAG GAC CAT 336
Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His
100 105 110

CTG GAG TTC TGG GAG AGC GTC TTC ACC GGC CTC ACC CAC ATA GAT GCC 384
Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala
115 120 125

CAC TTC TTG TCC CAG ACY AAA CAG GCA GGA GAC AAC TTC CCC TAC CTG 432
His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu
130 135 140

GTA GCA TAC CAG GCT ACA GTG TGC GCC AGG GCC AAG GCT CCA CCT CCA 480
Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro Pro
145 150 155 160

TCG TGG GAT CAR ATG TGG AAG TGT CTC ATA CGG CTG AAG CCT ACG CTA 528
Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu
165 170 175

CAC GGG CCA ACG CCC CTG TTG YAT AGG TTA GGA GCC GTT CAG AAC RAG 576
His Gly Pro Thr Pro Leu Leu Xab Arg Leu Gly Ala Val Gln Asn Xac
180 185 190

GTT RCC CTY ACA CAC CCY ATA ACC AAG TWC ATC ATG RCA TGC ATG TCG 624
Val Xad Leu Thr His Pro Ile Thr Lys Xae Ile Met Xaf Cys Met Ser
195 200 205

GCT GAC CTA GAG GTC GTC ACT AGC ACT TGG GTG CTG GTA GGC GGG GTC 672
Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val
210 215 220

CTC GCG GCT CTG GCC GCG TAC TGC CTG ACA ACG GGC AGC GTG GTC ATT 720
Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val Ile
225 230 235 240

GTG GGC AGG ATC ATC TTG TCC GGG AGG CCG GCC GTT ATT CCC GAC AGG 768
Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Val Ile Pro Asp Arg
245 250 255

GAA GTT CTC TAC CAA GAG TTC GAT GAA ATG GAA GAG TGC GCC TCG CAC 816
Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His
260 265 270

CTC CCT TAC ATC GAA CAA GGA ATG CAG CTC GCC GAG CAA TTC AAG CAG 864
Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys Gln
275 280 285

AAG GCG CTC GGT TTG CTG CAA ACA GCC ACC AAG CAA GCG GAG GCT GCT 912
Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala Ala
290 295 300

GCT CCC GTG GTG GAG TCC AAG TGG CGA GCC CTT GAG ACC TTC TGG GCG 960
Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala
305 310 315 320

AAG CAC ATG TGG AAT TTC ATC AGC GGG ATA CAG TAC TTA GCA GGC TTG 1008
Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu
325 330 335

TCC ACT CTG CCT GGA AAC CCC GCA ATA GCA TCA CTG ATG GCA TTC ACA 1056
Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr
340 345 350

GCC TCT ATC ACC AGC CCG CTC ACC ACC CAA TAT ACC CTC CTG TTT AAC 1104
Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Tyr Thr Leu Leu Phe Asn
355 360 365

ATC TTG GGG GGA TGG GTG GCC GCC CAA CTC GCC CCC CCC AGT GCC GCT 1152
Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala Ala
370 375 380

TCA GCC TTC GTG GGC GCC GGT ATA GCT GGC GCG GCT GTT GGC AGC ATA 1200
Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser Ile
385 390 395 400

GGC CTC GGG AAG GTG CTT GTG GAC ATT CTG GCG GGT TAT GGA GCA GGG 1248
Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly
405 410 415

GTG GCA GGC GCG CTC GTG GCC TTT AAG GTC ATG AGC GGT GAC ATG CCC 1296
Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Asp Met Pro
420 425 430

TCC ACC GAG GAC CTG GTC AAC TTA CTC CCC GCC ATC CTC TCT CCT GGT 1344
Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly
435 440 445

GCC CTG GTC GTC GGG GTC GTG TGC GCA GCA ATA CTG CGT CGG CAT GTG 1392
Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val
450 455 460

GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 1431
Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
465 470 475

Y : C or T    R : A or G    M : A or C    K : G or T
S : G or C    W : A or T    H : A or C or T    B : G or T or C

Xaa : Pro or Ser    Xab : Tyr or His    Xac : Glu or Lys
Xad : Thr or Ala    Xae : Tyr or Phe    Xaf : Thr or Ala



SEQ ID NO: 42 

SEQUENCE LENGTH: 2304 base pairs 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS : double 

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: N23N15 



40 



45 



CTG CTG TCG CCC GGG CCC ATC TCY TAC YTG AAG GGY TCC TCG GGT GGT 48 
Leu Leu Ser Pro Gly Pro He Ser Tyr Leu Lys Gly Ser Ser Gly Gly 

1 5 io 15 

CCG CTG CYT TGC CCC TCG GGC CRT GTT GTG GGC ATC TTC CGG GCT GCY 96 
Pro Leu Xaa Cys Pro Ser Gly Xab Val Val Gly He Phe Arg Ala Ala 

20 25 30 

GTG TGC ACC CGG GGG GTT GCG AAG GCG GTR GAC TTT GTG CCC GTT GAG 144 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
35 40 45 

TCT ATG GAA ACC ACY ATG CGG TCT CCG GTC TTC RCG GAT AAC TCA ACC 192 
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Xac Asp Asn Ser Thr 
50 55 60 

CCC CCG GCC GTA CCG CAG WCA TTC CAA GTG GCC CAC CTA CAC GCT CCC 240 
Pro Pro Ala Val Pro Gln Xad Phe Gln Val Ala His Leu His Ala Pro 
65 70 75 80 

ACT GGC AGC GGC AAA AGC ACC ARG GTG CCG GCT GCG TAT GCG GCC CAA 288 
Thr Gly Ser Gly Lys Ser Thr Xae Val Pro Ala Ala Tyr Ala Ala Gln 
85 90 95 

GGG TAC AAG GTA CTC GTC CTG AAC CCG TCC GTT GCT GCC ACT TTG GGC 336 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
100 105 110 

TTT GGG GCG TAY ATG TCC AAG GCA CAT GGT GTT GAC CCT AAC ATC AGA 384 
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg 
115 120 125 

ACT GGG GTG AGG ACC ATC ACC ACG GGC GCT CCC RTC ACG TAC TCC ACC 432 
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Xaf Thr Tyr Ser Thr 
130 135 140 

TAC GGT AAG TTC CTC GCC GAC GGT GGC TGT TCT GGG GGT GCC TAT GAC 480 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
145 150 155 160 

ATC ATA ATA TGT GAT GAG TGT CAT TCA ACT GAC TCG ACT TCC ATC TTG 528 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu 
165 170 175 

GGC ATT GGT ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGC CTT 576 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu 
180 185 190 

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCG CAT 624 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
195 200 205 

CCT AAT ATT GAG GAG GTG GCC TTG TCC AAC ACT GGA GAG ATC CCC TTC 672 
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe 
210 215 220 

TAT GGC AAG GCC ATC CCC CTC GAG GCC ATC AAG GGG GGG AGG CAT CTC 720 
Tyr Gly Lys Ala Ile Pro Leu Glu Ala Ile Lys Gly Gly Arg His Leu 
225 230 235 240 

AYT TTC TGC CAT TCC AAG AAG AAA TGT GAC GAG CTC GCT GCG AAG CTG 768 
Xag Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu 

245 250 255 

TCG GCC CTC GGA GTC AAY GCT GTA GCA TAY TAC CGG GGT CTT GAT GTG 816 
Ser Ala Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 

260 265 270 

TCC RTC ATA CCG ACA AGC GGG GAC GTC GTT GTC GTG GCA ACW GAC GCT 864 
Ser Xah Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala 

275 280 285 

CTA ATG ACG GGC TAT ACC GGY GAC TTY GAC TCR GTG ATC GAC TGY AAC 912 
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn 

290 295 300 

ACA TGT GTC ACC CAA ACA GTC GAT TTC AGC TTG GAC CCT ACT TTC ACC 960 
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr 
305 310 315 320 

ATY GAG ACG ACG ACC GTA CCC CAA GAT GCG GTG TCG CGC TCG CAG CGG 1008 
Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg 

325 330 335 

CGA GGC AGG ACT GGT AGG GGC AGR GGG GGC ATA TAC AGG TTT GTA ACT 1056 
Arg Gly Arg Thr Gly Arg Gly Arg Gly Gly Ile Tyr Arg Phe Val Thr 

340 345 350 

CCA GGG GAA CGG CCC TCA GGC ATG TTC GAT TCT TCG GTC CTG TGT GAA 1104 
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 

355 360 365 

TGT TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACG YCC GCC GAG ACC 1152 
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Xai Ala Glu Thr 
370 375 380 

TCG GTT AGG TTG CGG GCT TAC CTA AAY ACA CCT GGG CTG CCC GTC TGC 1200 
Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
385 390 395 400 

CAG GAC CAT CTG GAG TTC TGG GAG AGC GTC TTC ACC GGC CTC ACC CAC 1248 
Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His 

405 410 415 

ATA GAT GCC CAC TTC TTG TCC CAG ACY AAA CAG GCA GGA GAC AAC TTC 1296 
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe 

420 425 430 
CCC TAC CTG GTA GCA TAC CAG GCT ACA GTG TGC GCC AGG GCC AAG GCT 1344 
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala 

435 440 445 

CCA CCT CCA TCG TGG GAT CAR ATG TGG AAG TGT CTC ATA CGG CTG AAG 1392 
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys 

450 455 460 

CCT ACG CTA CAC GGG CCA ACG CCC CTG TTG YAT AGG TTA GGA GCC GTT 1440 
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Xaj Arg Leu Gly Ala Val 
465 470 475 480 

CAG AAC RAG GTT RCC CTY ACA CAC CCY ATA ACC AAG TWC ATC ATG RCA 1488 
Gln Asn Xak Val Xal Leu Thr His Pro Ile Thr Lys Xam Ile Met Xan 

485 490 495 

TGC ATG TCG GCT GAC CTA GAG GTC GTC ACT AGC ACT TGG GTG CTG GTA 1536 
Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val 

500 505 510 

GGC GGG GTC CTC GCG GCT CTG GCC GCG TAC TGC CTG ACA ACG GGC AGC 1584 
Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser 

515 520 525 

GTG GTC ATT GTG GGC AGG ATC ATC TTG TCC GGG AGG CCG GCC GTT ATT 1632 
Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Val Ile 

530 535 540 

CCC GAC AGG GAA GTT CTC TAC CAA GAG TTC GAT GAA ATG GAA GAG TGC 1680 
Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys 
545 550 555 560 

GCC TCG CAC CTC CCT TAC ATC GAA CAA GGA ATG CAG CTC GCC GAG CAA 1728 
Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln 

565 570 575 

TTC AAG CAG AAG GCG CTC GGT TTG CTG CAA ACA GCC ACC AAG CAA GCG 1776 
Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala 

580 585 590 

GAG GCT GCT GCT CCC GTG GTG GAG TCC AAG TGG CGA GCC CTT GAG ACC 1824 
Glu Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr 

595 600 605 

TTC TGG GCG AAG CAC ATG TGG AAT TTC ATC AGC GGG ATA CAG TAC TTA 1872 
Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu 
610 615 620 

GCA GGC TTG TCC ACT CTG CCT GGA AAC CCC GCA ATA GCA TCA CTG ATG 1920 
Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met 
625 630 635 640 

GCA TTC ACA GCC TCT ATC ACC AGC CCG CTC ACC ACC CAA TAT ACC CTC 1968 
Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Tyr Thr Leu 
645 650 655 

CTG TTT AAC ATC TTG GGG GGA TGG GTG GCC GCC CAA CTC GCC CCC CCC 2016 
Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro 
660 665 670 

AGT GCC GCT TCA GCC TTC GTG GGC GCC GGT ATA GCT GGC GCG GCT GTT 2064 
Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val 
675 680 685 

GGC AGC ATA GGC CTC GGG AAG GTG CTT GTG GAC ATT CTG GCG GGT TAT 2112 
Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr 
690 695 700 

GGA GCA GGG GTG GCA GGC GCG CTC GTG GCC TTT AAG GTC ATG AGC GGT 2160 
Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly 
705 710 715 720 

GAC ATG CCC TCC ACC GAG GAC CTG GTC AAC TTA CTC CCC GCC ATC CTC 2208 
Asp Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu 
725 730 735 

TCT CCT GGT GCC CTG GTC GTC GGG GTC GTG TGC GCA GCA ATA CTG CGT 2256 
Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg 
740 745 750 

CGG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 2304 
Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu 
755 760 765 

Y : C or T      R : A or G      M : A or C      K : G or T 
S : G or C      W : A or T      H : A or C or T      B : G or T or C 
Xaa : Leu or Pro      Xab : His or Arg      Xac : Thr or Ala 
Xad : Ser or Thr      Xae : Lys or Arg      Xaf : Ile or Val 
Xag : Thr or Ile      Xah : Val or Ile      Xai : Pro or Ser 
Xaj : Tyr or His      Xak : Glu or Lys      Xal : Thr or Ala 
Xam : Tyr or Phe      Xan : Thr or Ala 



SEQ ID NO: 43 

SEQUENCE LENGTH: 3564 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: MX25N15 



TGT GCC TGG TTG TGG ATG ATG CTG CTG ATA GCC CAA GCT GAG GCC GCC 48 
Cys Ala Trp Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala 

1 5 10 15 

TTG GAG AAC CTG GTG GTC CTC AAT GCA GCA TCC ATG GCS GGA GCG CAT 96 
Leu Glu Asn Leu Val Val Leu Asn Ala Ala Ser Met Ala Gly Ala His 

20 25 30 

GGC ATC CTC TCT TTC CTT GTG TTC TTC TGT GCC GCC TGG TAC ATC AAA 144 
Gly Ile Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys 

35 40 45 

GGC AGG CTG GTC CCY GGG GCG RCA TAY GCT YTC TAT GGC GTA TGG CCG 192 
Gly Arg Leu Val Pro Gly Ala Xaa Tyr Ala Xab Tyr Gly Val Trp Pro 
50 55 60 

CTG CTC CTG CTC TTG MTG GCG CTA CCS SCA CGG GCG TAC GCC ATG GAC 240 
Leu Leu Leu Leu Leu Xac Ala Leu Pro Xad Arg Ala Tyr Ala Met Asp 
65 70 75 80 

CGG GAS ATG GCT GCA TCG TGC GGA GGC GCG GTT TTT GTA GGT CTG GTA 288 
Arg Xae Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly Leu Val 

85 90 95 

CTC YTG ACC TTG TCA CCA TAC TAC AAA GTG TTC CTC GCT ARG CTC ATA 336 
Leu Leu Thr Leu Ser Pro Tyr Tyr Lys Val Phe Leu Ala Xaf Leu Ile 

100 105 110 

TGG TGG TTR CAA TAT CTC ATC ACC AGR GCC GAG GCG CAC YTG CAA GTG 384 
Trp Trp Leu Gln Tyr Leu Ile Thr Arg Ala Glu Ala His Leu Gln Val 
115 120 125 

TGG ATY CCC CCY CTY AAC GTY CGG GGR GGC CGC GAY GCC ATC ATC CTY 432 
Trp Ile Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Ile Ile Leu 
130 135 140 

CTC ACR TGT GCG GTC CAY CCR GAG CTR ATY TTT GAC ATC ACC AAR CTY 480 
Leu Thr Cys Ala Val His Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu 
145 150 155 160 

YTG CTC GCC ATA CTC GGT CCG CTC ATG GTR CTC CAG GCT GSC MTA ACY 528 
Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala Xag Xah Thr 

165 170 175 

MRA RTG CCG TAC TTY GTR CGY GCT CAA GGG CTC ATY CGT RYG TGC ATG 576 
Xai Xaj Pro Tyr Phe Val Arg Ala Gln Gly Leu Ile Arg Xak Cys Met 

180 185 190 

TTR GTG CGG AAA GYC GCY GGR GGT CAT TAT GTY CAR ATG GCY YTY RTG 624 
Leu Val Arg Lys Xal Ala Gly Gly His Tyr Val Gln Met Ala Xam Xan 

195 200 205 

AAG CTG GCY GCR CTG ACA GGT ACG TAC RTT TAT GWC CAT CTT RCY CCA 672 
Lys Leu Ala Ala Leu Thr Gly Thr Tyr Xao Tyr Xap His Leu Xaq Pro 

210 215 220 

CTG CAG SAY TGG GCC CAY GCG GGC CTA CGR GAC CTT GCG GTR GCR GTW 720 
Leu Gln Xar Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val Ala Val 
225 230 235 240 

GAG CCC GTT GYC TTC TCT GAY ATG GAG ACY AAG ATC ATC ACS TGG GGG 768 
Glu Pro Val Xas Phe Ser Asp Met Glu Thr Lys Ile Ile Thr Trp Gly 

245 250 255 

GCA GAS ACB GCG GCG TGT GGG GAC ATC ATY TYG GGY CTA CCH GTY TCC 816 
Ala Xat Thr Ala Ala Cys Gly Asp Ile Ile Xau Gly Leu Pro Val Ser 

260 265 270 

GCC CGR AGG GGS ARS GAG MTR CTY YTS GGR CCG GCC GAT AGT TTT GAC 864 
Ala Arg Arg Gly Xav Glu Xaw Leu Xax Gly Pro Ala Asp Ser Phe Asp 

275 280 285 

GGG CAG GGG TGG CGA CTC CTT GCG CCT ATC ACG GCC TAC TCC CAG CAR 912 
Gly Gln Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ser Gln Gln 

290 295 300 

ACG CGG GGC CTG CTT GGT TGC ATC ATC ACY AGC CTT ACG GGC CGG GAT 960 
Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp 
305 310 315 320 

AAR AAC CAG GTC GAG GGG GAG GTT CAA GTG GTC TCT ACC GCA ACA CAA 1008 
Lys Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln 

325 330 335 

TCT TTC CTG GCG ACC TGY RTC AAC GGC GTK TGC TGG ACT GTT TTC CAC 1056 
Ser Phe Leu Ala Thr Cys Xay Asn Gly Val Cys Trp Thr Val Phe His 

340 345 350 
GGY GCC GGC TCG AAG ACC TTA GCC GGC CCA AAA GGC CCA ATC ACC CAA 1104 
Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln 

355 360 365 

ATG TAC ACC AAT GTR GAT CAG GAC CTC GTC GGY TGG TCG GCG CCC CCC 1152 
Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Ser Ala Pro Pro 

370 375 380 

SGG GCG CGT TCC TTG ACA CCW TGC ACC TGC GGC AGC TCG GAC CTT TAT 1200 
Xaz Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr 
385 390 395 400 

TTG GTC ACG AGR CAT GCT GAT GTC ATT CCG GTG CAC CGG CGG GGC GAC 1248 
Leu Val Thr Arg His Ala Asp Val Ile Pro Val His Arg Arg Gly Asp 

405 410 415 

AGC AGG GGG AGC CTS CTS TCS CCC GGG CCC ATC TCY TAC YTG AAG GGY 1296 
Ser Arg Gly Ser Leu Leu Ser Pro Gly Pro Ile Ser Tyr Leu Lys Gly 

420 425 430 

TCC TCG GGT GGT CCG CTG CYT TGC CCC TCG GGC CRT GTT GTG GGC ATC 1344 
Ser Ser Gly Gly Pro Leu Xba Cys Pro Ser Gly Xbb Val Val Gly Ile 

435 440 445 

TTC CGG GCT GCY GTG TGC ACC CGG GGG GTT GCG AAG GCG GTR GAC TTT 1392 
Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe 

450 455 460 

GTG CCC GTT GAG TCT ATG GAA ACC ACY ATG CGG TCT CCG GTC TTC RCG 1440 
Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Xbc 
465 470 475 480 

GAT AAC TCA ACC CCC CCG GCC GTA CCG CAG WCA TTC CAA GTG GCC CAC 1488 
Asp Asn Ser Thr Pro Pro Ala Val Pro Gln Xbd Phe Gln Val Ala His 

485 490 495 

CTA CAC GCT CCC ACT GGC AGC GGC AAA AGC ACC ARG GTG CCG GCT GCG 1536 
Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Xbe Val Pro Ala Ala 

500 505 510 

TAT GCG GCC CAA GGG TAC AAG GTA CTC GTC CTG AAC CCG TCC GTT GCT 1584 
Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala 

515 520 525 

GCC ACT TTG GGC TTT GGG GCG TAY ATG TCC AAG GCA CAT GGT GTT GAC 1632 
Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp 
530 535 540 

CCT AAC ATC AGA ACT GGG GTG AGG ACC ATC ACC ACG GGC GCT CCC RTC 1680 
Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Xbf 
545 550 555 560 

ACG TAC TCC ACC TAC GGT AAG TTC CTC GCC GAC GGT GGC TGT TCT GGG 1728 
Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly 

565 570 575 

GGT GCC TAT GAC ATC ATA ATA TGT GAT GAG TGT CAT TCA ACT GAC TCG 1776 
Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser 

580 585 590 

ACT TCC ATC TTG GGC ATT GGT ACA GTC CTG GAC CAA GCG GAG ACG GCT 1824 
Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala 

595 600 605 

GGA GCG CGC CTT GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC 1872 
Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val 

610 615 620 

ACC GTG CCG CAT CCT AAT ATT GAG GAG GTG GCC TTG TCC AAC ACT GGA 1920 
Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly 
625 630 635 640 

GAG ATC CCC TTC TAT GGC AAG GCC ATC CCC CTC GAG GCC ATC AAG GGG 1968 
Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Ala Ile Lys Gly 

645 650 655 

GGG AGG CAT CTC AYT TTC TGC CAT TCC AAG AAG AAA TGT GAC GAG CTC 2016 
Gly Arg His Leu Xbg Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu 

660 665 670 

GCT GCG AAG CTG TCG GCC CTC GGA GTC AAY GCT GTA GCA TAY TAC CGG 2064 
Ala Ala Lys Leu Ser Ala Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg 

675 680 685 

GGT CTT GAT GTG TCC RTC ATA CCG ACA AGC GGG GAC GTC GTT GTC GTG 2112 
Gly Leu Asp Val Ser Xbh Ile Pro Thr Ser Gly Asp Val Val Val Val 

690 695 700 

GCA ACW GAC GCT CTA ATG ACG GGC TAT ACC GGY GAC TTY GAC TCA GTG 2160 
Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val 
705 710 715 720 

ATC GAC TGC AAC ACA TGT GTC ACC CAA ACA GTC GAT TTC AGC TTG GAC 2208 
Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp 

725 730 735 

CCT ACT TTC ACC ATY GAG ACG ACG ACC GTA CCC CAA GAT GCG GTG TCG 2256 
Pro Thr Phe Thr Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser 
740 745 750 

CGC TCG CAG CGG CGA GGC AGG ACT GGT AGG GGC AGR GGG GGC ATA TAC 2304 
Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Gly Gly Ile Tyr 

755 760 765 

AGG TTT GTA ACT CCA GGG GAA CGG CCC TCA GGC ATG TTC GAT TCT TCG 2352 
Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser 

770 775 780 

GTC CTG TGT GAA TGT TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACG 2400 
Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr 
785 790 795 800 

YCC GCC GAG ACC TCG GTT AGG TTG CGG GCT TAC CTA AAY ACA CCT GGG 2448 
Xbi Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly 

805 810 815 

CTG CCC GTC TGC CAG GAC CAT CTG GAG TTC TGG GAG AGC GTC TTC ACC 2496 
Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr 

820 825 830 

GGC CTC ACC CAC ATA GAT GCC CAC TTC TTG TCC CAG ACY AAA CAG GCA 2544 
Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala 
835 840 845 

GGA GAC AAC TTC CCC TAC CTG GTA GCA TAC CAG GCT ACA GTG TGC GCC 2592 
Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala 

850 855 860 

AGG GCC AAG GCT CCA CCT CCA TCG TGG GAT CAR ATG TGG AAG TGT CTC 2640 
Arg Ala Lys Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu 
865 870 875 880 

ATA CGG CTG AAG CCT ACG CTA CAC GGG CCA ACG CCC CTG TTG YAT AGG 2688 
Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Xbj Arg 
885 890 895 

TTA GGA GCC GTT CAG AAC RAG GTT RCC CTY ACA CAC CCY ATA ACC AAG 2736 
Leu Gly Ala Val Gln Asn Xbk Val Xbl Leu Thr His Pro Ile Thr Lys 

900 905 910 

TWC ATC ATG RCA TGC ATG TCG GCT GAC CTA GAG GTC GTC ACT AGC ACT 2784 
Xbm Ile Met Xbn Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr 

915 920 925 

TGG GTG CTG GTA GGC GGG GTC CTC GCG GCT CTG GCC GCG TAC TGC CTG 2832 
Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu 

930 935 940 
ACA ACG GGC AGC GTG GTC ATT GTG GGC AGG ATC ATC TTG TCC GGG AGG 2880 
Thr Thr Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg 
945 950 955 960 

CCG GCC GTT ATT CCC GAC AGG GAA GTT CTC TAC CAA GAG TTC GAT GAA 2928 
Pro Ala Val Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu 

965 970 975 

ATG GAA GAG TGC GCC TCG CAC CTC CCT TAC ATC GAA CAA GGA ATG CAG 2976 
Met Glu Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln 

980 985 990 

CTC GCC GAG CAA TTC AAG CAG AAG GCG CTC GGT TTG CTG CAA ACA GCC 3024 
Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala 
995 1000 1005 

ACC AAG CAA GCG GAG GCT GCT GCT CCC GTG GTG GAG TCC AAG TGG CGA 3072 
Thr Lys Gln Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg 

1010 1015 1020 

GCC CTT GAG ACC TTC TGG GCG AAG CAC ATG TGG AAT TTC ATC AGC GGG 3120 
Ala Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly 
1025 1030 1035 1040 

ATA CAG TAC TTA GCA GGC TTG TCC ACT CTG CCT GGA AAC CCC GCA ATA 3168 
Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile 

1045 1050 1055 

GCA TCA CTG ATG GCA TTC ACA GCC TCT ATC ACC AGC CCG CTC ACC ACC 3216 
Ala Ser Leu Met Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr 
1060 1065 1070 

CAA TAT ACC CTC CTG TTT AAC ATC TTG GGG GGA TGG GTG GCC GCC CAA 3264 
Gln Tyr Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln 
1075 1080 1085 

CTC GCC CCC CCC AGT GCC GCT TCA GCC TTC GTG GGC GCC GGT ATA GCT 3312 
Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala 

1090 1095 1100 

GGC GCG GCT GTT GGC AGC ATA GGC CTC GGG AAG GTG CTT GTG GAC ATT 3360 
Gly Ala Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile 

1105 1110 1115 1120 

CTG GCG GGT TAT GGA GCA GGG GTG GCA GGC GCG CTC GTG GCC TTT AAG 3408 
Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys 
1125 1130 1135 

GTC ATG AGC GGT GAC ATG CCC TCC ACC GAG GAC CTG GTC AAC TTA CTC 3456 
Val Met Ser Gly Asp Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu 
1140 1145 1150 

CCC GCC ATC CTC TCT CCT GGT GCC CTG GTC GTC GGG GTC GTG TGC GCA 3504 
Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala 
1155 1160 1165 

GCA ATA CTG CGT CGG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG 3552 
Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp 
1170 1175 1180 

ATG AAC CGG CTG 3564 
Met Asn Arg Leu 
1185 



Y : C or T      R : A or G      M : A or C      K : G or T 
S : G or C      W : A or T      H : A or C or T      B : G or T or C 
Xaa : Ala or Thr      Xab : Phe or Leu      Xac : Met or Leu 
Xad : Ala or Pro      Xae : Glu or Asp      Xaf : Lys or Arg 
Xag : Gly or Ala      Xah : Leu or Ile      Xai : Gln or Arg 
Xaj : Met or Val      Xak : Met or Ala      Xal : Ala or Val 
Xam : Leu or Phe      Xan : Met or Val      Xao : Val or Ile 
Xap : Asp or Val      Xaq : Thr or Ala      Xar : Asp or His 
Xas : Ala or Val      Xat : Asp or Glu      Xau : Leu or Ser 
Xav : Asn or Arg or Lys      Xaw : Ile or Leu      Xax : Leu or Phe 
Xay : Ile or Val      Xaz : Gly or Arg      Xba : Leu or Pro 
Xbb : His or Arg      Xbc : Thr or Ala      Xbd : Ser or Thr 
Xbe : Lys or Arg      Xbf : Ile or Val      Xbg : Thr or Ile 
Xbh : Val or Ile      Xbi : Pro or Ser      Xbj : Tyr or His 
Xbk : Glu or Lys      Xbl : Thr or Ala      Xbm : Tyr or Phe 
Xbn : Thr or Ala 
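The relationship between a degenerate codon in the listing and the alternative residues given for the corresponding Xaa symbol can be illustrated by the following minimal sketch. It assumes the standard genetic code and the ambiguity codes of the legend (Y, R, M, K, S, W, H, B); the names `IUPAC`, `CODON_TABLE` and `possible_residues` are hypothetical and form no part of the listing. The sketch enumerates every residue a degenerate codon could encode, whereas the legend may name only the residues actually observed in the clones.

```python
from itertools import product

# Ambiguity codes as given in the legend of the listing.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "R": "AG", "M": "AC", "K": "GT",
    "S": "GC", "W": "AT", "H": "ACT", "B": "GTC",
}

# Standard genetic code, indexed by base order T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def possible_residues(codon):
    """Return the set of three-letter residues a degenerate codon can encode."""
    choices = [IUPAC[base] for base in codon.upper()]
    return {THREE_LETTER[CODON_TABLE["".join(c)]] for c in product(*choices)}


if __name__ == "__main__":
    # Codons taken from the listing: CYT, CRT and RCG are ambiguous at the
    # protein level, while GTR is a silent variation.
    for codon in ("CYT", "CRT", "RCG", "GTR"):
        print(codon, sorted(possible_residues(codon)))
```

A codon such as GTR expands to a single residue, which is why such positions carry an ordinary residue name rather than an Xaa symbol in the amino acid rows.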























SEQ ID NO: 44 

SEQUENCE LENGTH: 849 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: MX25-1 

TGT GCC TGG TTG TGG ATG ATG CTG CTG ATA GCC CAA GCT GAG GCC GCC 48 
Cys Ala Trp Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala 

1 5 10 15 

TTG GAG AAC CTG GTG GTC CTC AAT GCA GCA TCC ATG GCG GGA GCG CAT 96 
Leu Glu Asn Leu Val Val Leu Asn Ala Ala Ser Met Ala Gly Ala His 

20 25 30 

GGC ATC CTC TCT TTC CTT GTG TTC TTC TGT GCC GCC TGG TAC ATC AAA 144 
Gly Ile Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys 

35 40 45 

GGC AGG CTG GTC CCT GGG GCG GCA TAC GCT TTC TAT GGC GTA TGG CCG 192 
Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro 

50 55 60 

CTG CTC CTG CTC TTG ATG GCG CTA CCC GCA CGG GCG TAC GCC ATG GAC 240 
Leu Leu Leu Leu Leu Met Ala Leu Pro Ala Arg Ala Tyr Ala Met Asp 
65 70 75 80 

CGG GAG ATG GCT GCA TCG TGC GGA GGC GCG GTT TTT GTA GGT CTG GTA 288 
Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly Leu Val 

85 90 95 

CTC TTG ACC TTG TCA CCA TAC TAC AAA GTG TTC CTC GCT AAG CTC ATA 336 
Leu Leu Thr Leu Ser Pro Tyr Tyr Lys Val Phe Leu Ala Lys Leu Ile 

100 105 110 

TGG TGG TTG CAA TAT CTC ATC ACC AGG GCC GAG GCG CAC TTG CAA GTG 384 
Trp Trp Leu Gln Tyr Leu Ile Thr Arg Ala Glu Ala His Leu Gln Val 

115 120 125 

TGG ATC CCC CCC CTC AAC GTT CGG GGG GGC CGC GAT GCC ATC ATC CTT 432 
Trp Ile Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Ile Ile Leu 

130 135 140 

CTC ACA TGT GCG GTC CAC CCG GAG CTG ATC TTT GAC ATC ACC AAG CTC 480 
Leu Thr Cys Ala Val His Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu 
145 150 155 160 

TTG CTC GCC ATA CTC GGT CCG CTC ATG GTA CTC CAG GCT GGC CTA ACC 528 
Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala Gly Leu Thr 

165 170 175 

CAA ATG CCG TAC TTT GTG CGT GCT CAA GGG CTC ATT CGT ATG TGC ATG 576 
Gln Met Pro Tyr Phe Val Arg Ala Gln Gly Leu Ile Arg Met Cys Met 
180 185 190 

TTG GTG CGG AAA GCC GCT GGG GGT CAT TAT GTC CAG ATG GCT CTC ATG 624 
Leu Val Arg Lys Ala Ala Gly Gly His Tyr Val Gln Met Ala Leu Met 

195 200 205 

AAG CTG GCT GCA CTG ACA GGT ACG TAC GTT TAT GAC CAT CTT ACT CCA 672 
Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu Thr Pro 

210 215 220 

CTG CAG GAC TGG GCC CAC GCG GGC CTA CGA GAC CTT GCG GTA GCA GTT 720 
Leu Gln Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val Ala Val 
225 230 235 240 

GAG CCC GTT GCC TTC TCT GAT ATG GAG ACT AAG ATC ATC ACC TGG GGG 768 
Glu Pro Val Ala Phe Ser Asp Met Glu Thr Lys Ile Ile Thr Trp Gly 

245 250 255 

GCA GAC ACT GCG GCG TGT GGG GAC ATC ATT TTG GGC CTA CCT GTC TCC 816 
Ala Asp Thr Ala Ala Cys Gly Asp Ile Ile Leu Gly Leu Pro Val Ser 

260 265 270 

GCC CGG AGG GGC AAC GAG ATA CTC CTC GGA CCG 849 
Ala Arg Arg Gly Asn Glu Ile Leu Leu Gly Pro 
275 280 

SEQ ID NO: 45 

SEQUENCE LENGTH: 849 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: MX25-2 

TGT GCC TGG TTG TGG ATG ATG CTG CTG ATA GCC CAA GCT GAG GCC GCC 48 
Cys Ala Trp Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala 

1 5 10 15 

TTG GAG AAC CTG GTG GTC CTC AAT GCA GCA TCC ATG GCG GGA GCG CAT 96 
Leu Glu Asn Leu Val Val Leu Asn Ala Ala Ser Met Ala Gly Ala His 

20 25 30 
GGC ATC CTC TCT TTC CTT GTG TTC TTC TGT GCC GCC TGG TAC ATC AAA 144 
Gly Ile Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys 

35 40 45 

GGC AGG CTG GTC CCT GGG GCG ACA TAC GCT CTC TAT GGC GTA TGG CCG 192 
Gly Arg Leu Val Pro Gly Ala Thr Tyr Ala Leu Tyr Gly Val Trp Pro 

50 55 60 

CTG CTC CTG CTC TTG ATG GCG CTA CCG CCA CGG GCG TAC GCC ATG GAC 240 
Leu Leu Leu Leu Leu Met Ala Leu Pro Pro Arg Ala Tyr Ala Met Asp 
65 70 75 80 

CGG GAC ATG GCT GCA TCG TGC GGA GGC GCG GTT TTT GTA GGT CTG GTA 288 
Arg Asp Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly Leu Val 

85 90 95 

CTC TTG ACC TTG TCA CCA TAC TAC AAA GTG TTC CTC GCT AGG CTC ATA 336 
Leu Leu Thr Leu Ser Pro Tyr Tyr Lys Val Phe Leu Ala Arg Leu Ile 

100 105 110 

TGG TGG TTA CAA TAT CTC ATC ACC AGA GCC GAG GCG CAC CTG CAA GTG 384 
Trp Trp Leu Gln Tyr Leu Ile Thr Arg Ala Glu Ala His Leu Gln Val 

115 120 125 

TGG ATT CCC CCT CTC AAC GTC CGG GGA GGC CGC GAC GCC ATC ATC CTC 432 
Trp Ile Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Ile Ile Leu 

130 135 140 

CTC ACG TGT GCG GTC CAT CCA GAG CTA ATT TTT GAC ATC ACC AAA CTT 480 
Leu Thr Cys Ala Val His Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu 
145 150 155 160 

CTG CTC GCC ATA CTC GGT CCG CTC ATG GTG CTC CAG GCT GCC ATA ACT 528 
Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala Ala Ile Thr 

165 170 175 

AGA GTG CCG TAC TTC GTA CGC GCT CAA GGG CTC ATC CGT GCG TGC ATG 576 
Arg Val Pro Tyr Phe Val Arg Ala Gln Gly Leu Ile Arg Ala Cys Met 

180 185 190 

TTA GTG CGG AAA GCC GCC GGA GGT CAT TAT GTT CAA ATG GCC TTT GTG 624 
Leu Val Arg Lys Ala Ala Gly Gly His Tyr Val Gln Met Ala Phe Val 

195 200 205 

AAG CTG GCC GCG CTG ACA GGT ACG TAC ATT TAT GAC CAT CTT GCC CCA 672 
Lys Leu Ala Ala Leu Thr Gly Thr Tyr Ile Tyr Asp His Leu Ala Pro 
210 215 220 

CTG CAG CAT TGG GCC CAT GCG GGC CTA CGG GAC CTT GCG GTG GCG GTA 720 
Leu Gln His Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val Ala Val 
225 230 235 240 

GAG CCC GTT GTC TTC TCT GAC ATG GAG ACC AAG ATC ATC ACC TGG GGG 768 
Glu Pro Val Val Phe Ser Asp Met Glu Thr Lys Ile Ile Thr Trp Gly 

245 250 255 

GCA GAC ACC GCG GCG TGT GGG GAC ATC ATT TTG GGC CTA CCA GTC TCC 816 
Ala Asp Thr Ala Ala Cys Gly Asp Ile Ile Leu Gly Leu Pro Val Ser 

260 265 270 

GCC CGG AGG GGC AAC GAG ATA CTC CTC GGA CCG 849 
Ala Arg Arg Gly Asn Glu Ile Leu Leu Gly Pro 
275 280 



SEQ ID NO: 46 

SEQUENCE LENGTH: 849 base pairs 

SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: MX25-3 

TGT GCC TGG TTG TGG ATG ATG CTG CTG ATA GCC CAA GCT GAG GCC GCC 48 
Cys Ala Trp Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala 

1 5 10 15 

TTG GAG AAC CTG GTG GTC CTC AAT GCA GCA TCC ATG GCC GGA GCG CAT 96 
Leu Glu Asn Leu Val Val Leu Asn Ala Ala Ser Met Ala Gly Ala His 

20 25 30 

GGC ATC CTC TCT TTC CTT GTG TTC TTC TGT GCC GCC TGG TAC ATC AAA 144 
Gly Ile Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys 

35 40 45 

GGC AGG CTG GTC CCC GGG GCG GCA TAT GCT TTC TAT GGC GTA TGG CCG 192 
Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro 

50 55 60 

CTG CTC CTG CTC TTG CTG GCG CTA CCC GCA CGG GCG TAC GCC ATG GAC 240 
Leu Leu Leu Leu Leu Leu Ala Leu Pro Ala Arg Ala Tyr Ala Met Asp 
65 70 75 80 

CGG GAG ATG GCT GCA TCG TGC GGA GGC GCG GTT TTT GTA GGT CTG GTA 288 
Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly Leu Val 

85 90 95 

CTC CTG ACC TTG TCA CCA TAC TAC AAA GTG TTC CTC GCT AAG CTC ATA 336 
Leu Leu Thr Leu Ser Pro Tyr Tyr Lys Val Phe Leu Ala Lys Leu Ile 

100 105 110 

TGG TGG TTG CAA TAT CTC ATC ACC AGG GCC GAG GCG CAC TTG CAA GTG 384 
Trp Trp Leu Gln Tyr Leu Ile Thr Arg Ala Glu Ala His Leu Gln Val 

115 120 125 

TGG ATC CCC CCC CTT AAC GTT CGG GGG GGC CGC GAT GCC ATC ATC CTT 432 
Trp Ile Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Ile Ile Leu 

130 135 140 

CTC ACA TGT GCG GTC CAC CCG GAG CTG ATC TTT GAC ATC ACC AAG CTC 480 
Leu Thr Cys Ala Val His Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu 
145 150 155 160 

TTG CTC GCC ATA CTC GGT CCG CTC ATG GTA CTC CAG GCT GGC CTA ACC 528 
Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala Gly Leu Thr 

165 170 175 

CAA ATG CCG TAC TTT GTG CGT GCT CAA GGG CTC ATT CGT ATG TGC ATG 576 
Gln Met Pro Tyr Phe Val Arg Ala Gln Gly Leu Ile Arg Met Cys Met 

180 185 190 

TTG GTG CGG AAA GTC GCT GGG GGT CAT TAT GTC CAG ATG GCT CTC ATG 624 
Leu Val Arg Lys Val Ala Gly Gly His Tyr Val Gln Met Ala Leu Met 

195 200 205 

AAG CTG GCT GCA CTG ACA GGT ACG TAC GTT TAT GTC CAT CTT ACT CCA 672 
Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Val His Leu Thr Pro 

210 215 220 

CTG CAG GAC TGG GCC CAC GCG GGC CTA CGA GAC CTT GCG GTA GCA GTT 720 
Leu Gln Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val Ala Val 
225 230 235 240 

GAG CCC GTT GTC TTC TCT GAT ATG GAG ACT AAG ATC ATC ACC TGG GGG 768 
Glu Pro Val Val Phe Ser Asp Met Glu Thr Lys Ile Ile Thr Trp Gly 

245 250 255 

GCA GAC ACC GCG GCG TGT GGG GAC ATC ATT TTG GGC CTA CCT GTC TCC 816 
Ala Asp Thr Ala Ala Cys Gly Asp Ile Ile Leu Gly Leu Pro Val Ser 

260 265 270 
GCC CGG AGG GGC AAC GAG ATA CTC CTC GGA CCG 849 
Ala Arg Arg Gly Asn Glu Ile Leu Leu Gly Pro 
275 280 

SEQ ID NO: 47 

SEQUENCE LENGTH: 524 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: 026-1 
ATC ACG TGG GGG GCA GAG ACG GCG GCG TGT GGG GAC ATC ATC TCG GGT 48 
Ile Thr Trp Gly Ala Glu Thr Ala Ala Cys Gly Asp Ile Ile Ser Gly 
1 5 10 15 

CTA CCC GTT TCC GCC CGA AGG GGG AGG GAG CTG CTT TTG GGG CCG GCC 96 
Leu Pro Val Ser Ala Arg Arg Gly Arg Glu Leu Leu Leu Gly Pro Ala 
20 25 30 

GAT AGT TTT GAC GGG CAG GGG TGG CGA CTC CTT GCG CCT ATC ACG GCC 144 
Asp Ser Phe Asp Gly Gln Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala 
35 40 45 

TAC TCC CAG CAG ACG CGG GGC CTG CTT GGT TGC ATC ATC ACT AGC CTT 192 
Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu 
50 55 60 

ACG GGC CGG GAT AAG AAC CAG GTC GAG GGG GAG GTT CAA GTG GTC TCT 240 
Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val Gln Val Val Ser 
65 70 75 80 

ACC GCA ACA CAA TCT TTC CTG GCG ACC TGT GTC AAC GGC GTG TGC TGG 288 
Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp 
85 90 95 

ACT GTT TTC CAC GGC GCC GGC TCG AAG ACC TTA GCC GGC CCA AAA GGC 336 
Thr Val Phe His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly 
100 105 110 

CCA ATC ACC CAA ATG TAC ACC AAT GTA GAT CAG GAC CTC GTC GGC TGG 384 
Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp 

115 120 125 

TCG GCG CCC CCC CGG GCG CGT TCC TTG ACA CCT TGC ACC TGC GGC AGC 432 
Ser Ala Pro Pro Arg Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser 

130 135 140 

TCG GAC CTT TAT TTG GTC ACG AGG CAT GCT GAT GTC ATT CCG GTG CAC 480 
Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val Ile Pro Val His 

145 150 155 160 

CGG CGG GGC GAC AGC AGG GGG AGC CTC CTC TCC CCC GGG CCC AT 524 
Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Gly Pro 

165 170 

SEQ ID NO: 48 

SEQUENCE LENGTH: 514 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: 026-2 



ATC ACG TGG GGG GCA GAG ACG GCG GCG TGT GGG GAC ATC ATC TCG GGT 48 
Ile Thr Trp Gly Ala Glu Thr Ala Ala Cys Gly Asp Ile Ile Ser Gly 

1 5 10 15 

CTA CCC GTT TCC GCC CGA AGG GGG AGG GAG CTG CTT TTG GGA CCG GCC 96 
Leu Pro Val Ser Ala Arg Arg Gly Arg Glu Leu Leu Leu Gly Pro Ala 

20 25 30 

GAT AGT TTT GAC GGG CAG GGG TGG CGA CTC CTT GCG CCT ATC ACG GCC 144 
Asp Ser Phe Asp Gly Gln Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala 
35 40 45 

TAC TCC CAG CAG ACG CGG GGC CTG CTT GGT TGC ATC ATC ACC AGC CTT 192 
Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu 
50 55 60 

ACG GGC CGG GAT AAG AAC CAG GTC GAG GGG GAG GTT CAA GTG GTC TCT 240 
Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val Gln Val Val Ser 
65 70 75 80 

ACC GCA ACA CAA TCT TTC CTG GCG ACC TGC ATC AAC GGC GTT TGC TGG 288 
Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp 

85 90 95 

ACT GTT TTC CAC GGC GCC GGC TCG AAG ACC TTA GCC GGC CCA AAA GGC 336 
Thr Val Phe His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly 

100 105 110 

CCA ATC ACC CAA ATG TAC ACC AAT GTA GAT CAG GAC CTC GTC GGC TGG 384 
Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp

115 120 125 

TCG GCG CCC CCC GGG GCG CGT TCC TTG ACA CCT TGC ACC TGC GGC AGC 432 
Ser Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser 

130 135 140 

TCG GAC CTT TAT TTG GTC ACG AGG CAT GCT GAT GTC ATT CCG GTG CAC 480 
Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val Ile Pro Val His
145 150 155 160 

CGG CGG GGC GAC AGC AGG GGG AGC CTC CTC TCC C 514 
Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser 

165 170 



SEQ ID NO: 49

SEQUENCE LENGTH: 523 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: 026-3

ATC ACG TGG GGG GCA GAG ACG GCG GCG TGT GGG GAC ATC ATC TCG GGT 48
Ile Thr Trp Gly Ala Glu Thr Ala Ala Cys Gly Asp Ile Ile Ser Gly

1 5 10 15 

CTA CCC GTT TCC GCC CGA AGG GGG AAG GAG CTG CTT TTG GGA CCG GCC 96 
Leu Pro Val Ser Ala Arg Arg Gly Lys Glu Leu Leu Leu Gly Pro Ala

20 25 30 
GAT AGT TTT GAC GGG CAG GGG TGG CGA CTC CTT GCG CCT ATC ACG GCC 144
Asp Ser Phe Asp Gly Gln Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala

35 40 45 

TAC TCC CAG CAA ACG CGG GGC CTG CTT GGT TGC ATC ATC ACT AGC CTT 192 
Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu

50 55 60 

ACG GGC CGG GAT AAA AAC CAG GTC GAG GGG GAG GTT CAA GTG GTC TCT 240 
Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val Gln Val Val Ser
65 70 75 80

ACC GCA ACA CAA TCT TTC CTG GCG ACC TGT GTC AAC GGC GTG TGC TGG 288 
Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp
85 90 95

ACT GTT TTC CAC GGT GCC GGC TCG AAG ACC TTA GCC GGC CCA AAA GGC 336 
Thr Val Phe His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly 

100 105 110

CCA ATC ACC CAA ATG TAC ACC AAT GTG GAT CAG GAC CTC GTC GGT TGG 384
Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp

115 120 125 

TCG GCG CCC CCC GGG GCG CGT TCC TTG ACA CCA TGC ACC TGC GGC AGC 432 
Ser Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser

130 135 140 

TCG GAC CTT TAT TTG GTC ACG AGA CAT GCT GAT GTC ATT CCG GTG CAC 480 
Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val Ile Pro Val His
145 150 155 160 

CGG CGG GGC GAC AGC AGG GGG AGC CTC CTC TCC CCC GGG CCC A 523 
Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Gly Pro 

165 170 
SEQ ID NO: 50

SEQUENCE LENGTH: 921 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N23-1

CTG CTG TCG CCC GGG CCC ATC TCT TAC TTG AAG GGT TCC TCG GGT GGT 48 
Leu Leu Ser Pro Gly Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly

1 5 10 15 

CCG CTG CCT TGC CCC TCG GGC CGT GTT GTG GGC ATC TTC CGG GCT GCC 96 
Pro Leu Pro Cys Pro Ser Gly Arg Val Val Gly Ile Phe Arg Ala Ala

20 25 30 

GTG TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTT GAG 144 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 

35 40 45 

TCT ATG GAA ACC ACC ATG CGG TCT CCG GTC TTC ACG GAT AAC TCA ACC 192 
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Thr 

50 55 60 

CCC CCG GCC GTA CCG CAG ACA TTC CAA GTG GCC CAC CTA CAC GCT CCC 240 
Pro Pro Ala Val Pro Gln Thr Phe Gln Val Ala His Leu His Ala Pro
65 70 75 80 

ACT GGC AGC GGC AAA AGC ACC AGG GTG CCG GCT GCG TAT GCG GCC CAA 288 
Thr Gly Ser Gly Lys Ser Thr Arg Val Pro Ala Ala Tyr Ala Ala Gln
85 90 95

GGG TAC AAG GTA CTC GTC CTG AAC CCG TCC GTT GCT GCC ACT TTG GGC 336 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 

100 105 110 

TTT GGG GCG TAC ATG TCC AAG GCA CAT GGT GTT GAC CCT AAC ATC AGA 384
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg

115 120 125 

ACT GGG GTG AGG ACC ATC ACC ACG GGC GCT CCC ATC ACG TAC TCC ACC 432 
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Ile Thr Tyr Ser Thr

130 135 140 

TAC GGT AAG TTC CTC GCC GAC GGT GGC TGT TCT GGG GGT GCC TAT GAC 480 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
145 150 155 160

ATC ATA ATA TGT GAT GAG TGT CAT TCA ACT GAC TCG ACT TCC ATC TTG 528 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu

165 170 175 

GGC ATT GGT ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGC CTT 576
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu

180 185 190

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCG CAT 624 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 

195 200 205 

CCT AAT ATT GAG GAG GTG GCC TTG TCC AAC ACT GGA GAG ATC CCC TTC 672 
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe

210 215 220 

TAT GGC AAG GCC ATC CCC CTC GAG GCC ATC AAG GGG GGG AGG CAT CTC 720 
Tyr Gly Lys Ala Ile Pro Leu Glu Ala Ile Lys Gly Gly Arg His Leu
225 230 235 240 

ATT TTC TGC CAT TCC AAG AAG AAA TGT GAC GAG CTC GCT GCG AAG CTG 768 
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu

245 250 255 

TCG GCC CTC GGA GTC AAC GCT GTA GGA TAT TAC CGG GGT CTT GAT GTG 816 
Ser Ala Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 
260 265 270

TCC ATC ATA CCG ACA AGC GGG GAC GTC GTT GTC GTG GCA ACA GAC GCT 864 
Ser Ile Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala
275 280 285 

CTA ATG ACG GGC TAT ACC GGT GAC TTT GAC TCG GTG ATC GAC TGC AAC 912
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn

290 295 300 

ACA TGT GTC 921
Thr Cys Val
305



SEQ ID NO: 51

SEQUENCE LENGTH: 921 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N23-2

CTG CTG TCG CCC GGG CCC ATC TCC TAC CTG AAG GGT TCC TCG GGT GGT 48
Leu Leu Ser Pro Gly Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly

1 5 10 15

CCG CTG CTT TGC CCC TCG GGC CAT GTT GTG GGC ATC TTC CGG GCT GCT 96 
Pro Leu Leu Cys Pro Ser Gly His Val Val Gly Ile Phe Arg Ala Ala

20 25 30 

GTG TGC ACC CGG GGG GTT GCG AAG GCG GTA GAC TTT GTG CCC GTT GAG 144 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 

35 40 45 

TCT ATG GAA ACC ACT ATG CGG TCT CCG GTC TTC ACG GAT AAC TCA ACC 192 
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Thr 

50 55 60 

CCC CCG GCC GTA CCG CAG TCA TTC CAA GTG GCC CAC CTA CAC GCT CCC 240 
Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro
65 70 75 80 

ACT GGC AGC GGC AAA AGC ACC AAG GTG CCG GCT GCG TAT GCG GCC CAA 288 
Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln

85 90 95 

GGG TAC AAG GTA CTC GTC CTG AAC CCG TCC GTT GCT GCC ACT TTG GGC 336 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 

100 105 110 

TTT GGG GCG TAT ATG TCC AAG GCA CAT GGT GTT GAC CCT AAC ATC AGA 384 
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg
115 120 125

ACT GGG GTG AGG ACC ATC ACC ACG GGC GCT CCC ATC ACG TAC TCC ACC 432 
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Ile Thr Tyr Ser Thr

130 135 140 

TAC GGT AAG TTC CTC GCC GAC GGT GGC TGT TCT GGG GGT GCC TAT GAC 480 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
145 150 155 160 

ATC ATA ATA TGT GAT GAG TGT CAT TCA ACT GAC TCG ACT TCC ATC TTG 528 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu

165 170 175 

GGC ATT GGT ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGC CTT 576 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu
180 185 190

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCG CAT 624 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His
195 200 205 

CCT AAT ATT GAG GAG GTG GCC TTG TCC AAC ACT GGA GAG ATC CCC TTC 672
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe

210 215 220 

TAT GGC AAG GCC ATC CCC CTC GAG GCC ATC AAG GGG GGG AGG CAT CTC 720 
Tyr Gly Lys Ala Ile Pro Leu Glu Ala Ile Lys Gly Gly Arg His Leu
225 230 235 240

ACT TTC TGC CAT TCC AAG AAG AAA TGT GAC GAG CTC GCT GCG AAG CTG 768 
Thr Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu 

245 250 255 

TCG GCC CTC GGA GTC AAC GCT GTA GCA TAC TAC CGG GGT CTT GAT GTG 816

Ser Ala Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 

260 265 270 

TCC GTC ATA CCG ACA AGC GGG GAC GTC GTT GTC GTG GCA ACT GAC GCT 864 
Ser Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala

275 280 285

CTA ATG ACG GGC TAT ACC GGT GAC TTT GAC TCA GTG ATC GAC TGC AAC 912 
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn

290 295 300

ACA TGT GTC 921
Thr Cys Val
305



SEQ ID NO: 52

SEQUENCE LENGTH: 921 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N23-3

CTG CTG TCG CCC GGG CCC ATC TCT TAC TTG AAG GGC TCC TCG GGT GGT 48
Leu Leu Ser Pro Gly Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly

1 5 10 15

CCG CTG CTT TGC CCC TCG GGC CAT GTT GTG GGC ATC TTC CGG GCT GCC 96 
Pro Leu Leu Cys Pro Ser Gly His Val Val Gly Ile Phe Arg Ala Ala

20 25 30 

GTG TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTT GAG 144 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 

35 40 45 

TCT ATG GAA ACC ACC ATG CGG TCT CCG GTC TTC GCG GAT AAC TCA ACC 192 
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Ala Asp Asn Ser Thr 

50 55 60 

CCC CCG GCC GTA CCG CAG ACA TTC CAA GTG GCC CAC CTA CAC GCT CCC 240 
Pro Pro Ala Val Pro Gln Thr Phe Gln Val Ala His Leu His Ala Pro
65 70 75 80 

ACT GGC AGC GGC AAA AGC ACC AGG GTG CCG GCT GCG TAT GCG GCC CAA 288 
Thr Gly Ser Gly Lys Ser Thr Arg Val Pro Ala Ala Tyr Ala Ala Gln

85 90 95 

GGG TAC AAG GTA CTC GTC CTG AAC CCG TCC GTT GCT GCC ACT TTG GGC 336 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 

100 105 110 

TTT GGG GCG TAC ATG TCC AAG GCA CAT GGT GTT GAC CCT AAC ATC AGA 384 
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg

115 120 125 

ACT GGG GTG AGG ACC ATC ACC ACG GGC GCT CCC GTC ACG TAC TCC ACC 432 
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr

130 135 140 

TAC GGT AAG TTC CTC GCC GAC GGT GGC TGT TCT GGG GGT GCC TAT GAC 480 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
145 150 155 160

ATC ATA ATA TGT GAT GAG TGT CAT TCA ACT GAC TCG ACT TCC ATC TTG 528 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu

165 170 175 

GGC ATT GGT ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGC CTT 576
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu

180 185 190 

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCG CAT 624 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
195 200 205 
CCT AAT ATT GAG GAG GTG GCC TTG TCC AAC ACT GGA GAG ATC CCC TTC 672
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe

210 215 220 

TAT GGC AAG GCC ATC CCC CTC GAG GCC ATC AAG GGG GGG AGG CAT CTC 720 
Tyr Gly Lys Ala Ile Pro Leu Glu Ala Ile Lys Gly Gly Arg His Leu
225 230 235 240 

ATT TTC TGC CAT TCC AAG AAG AAA TGT GAC GAG CTC GCT GCG AAG CTG 768 
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu

245 250 255 

TCG GCC CTC GGA GTC AAT GCT GTA GCA TAT TAC CGG GGT CTT GAT GTG 816 
Ser Ala Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 
260 265 270

TCC ATC ATA CCG ACA AGC GGG GAC GTC GTT GTC GTG GCA ACA GAC GCT 864 
Ser Ile Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala
275 280 285 

CTA ATG ACG GGC TAT ACC GGT GAC TTT GAC TCG GTG ATC GAC TGT AAC 912
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn

290 295 300 

ACA TGT GTC 921 
Thr Cys Val 
305 

SEQ ID NO: 53

SEQUENCE LENGTH: 623 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N16-1

GGC TAT ACC GGC GAC TTC GAC TCA GTG ATC GAC TGC AAC ACA TGT GTC 48
Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val

1 5 10 15

ACC CAA ACA GTC GAT TTC AGC TTG GAC CCT ACT TTC ACC ATC GAG ACG 96 
Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr

20 25 30 

ACG ACC GTA CCC CAA GAT GCG GTG TCG CGC TCG CAG CGG CGA GGC AGG 144 
Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg

35 40 45 

ACT GGT AGG GGC AGG GGG GGC ATA TAC AGG TTT GTA ACT CCA GGG GAA 192 
Thr Gly Arg Gly Arg Gly Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu
50 55 60

CGG CCC TCA GGC ATG TTC GAT TCT TCG GTC CTG TGT GAA TGT TAT GAC 240 
Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp 
65 70 75 80 

GCG GGC TGT GCT TGG TAC GAG CTC ACG CCC GCC GAG ACC TCG GTT AGG 288

Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg 

85 90 95 

TTG CGG GCT TAC CTA AAT ACA CCT GGG CTG CCC GTC TGC CAG GAC CAT 336 
Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His

100 105 110 

CTG GAG TTC TGG GAG AGC GTC TTC ACC GGC CTC ACC CAC ATA GAT GCC 384 
Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala

115 120 125 

CAC TTC TTG TCC CAG ACC AAA CAG GCA GGA GAC AAC TTC CCC TAC CTG 432 
His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu

130 135 140 

GTA GCA TAC CAG GCT ACA GTG TGC GCC AGG GCC AAG GCT CCA CCT CCA 480 
Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro Pro
145 150 155 160

TCG TGG GAT CAG ATG TGG AAG TGT CTC ATA CGG CTG AAG CCT ACG CTA 528 
Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu

165 170 175 

CAC GGG CCA ACG CCC CTG TTG TAT AGG TTA GGA GCC GTT CAG AAC GAG 576 
His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu

180 185 190 

GTT ACC CTT ACA CAC CCC ATA ACC AAG TAC ATC ATG ACA TGC ATG TC 623 
Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Thr Cys Met
195 200 205 

SEQ ID NO: 54

SEQUENCE LENGTH: 623 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N16-2

GGC TAT ACC GGC GAC TTC GAC TCA GTG ATC GAC TGC AAC ACA TGT GTC 48
Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val

1 5 10 15

ACC CAA ACA GTC GAT TTC AGC TTG GAC CCT ACT TTC ACC ATC GAG ACG 96 
Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr

20 25 30 

ACG ACC GTA CCC CAA GAT GCG GTG TCG CGC TCG CAG CGG CGA GGC AGG 144 
Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg

35 40 45 

ACT GGT AGG GGC AGG GGG GGC ATA TAC AGG TTT GTA ACT CCA GGG GAA 192 
Thr Gly Arg Gly Arg Gly Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu

50 55 60 

CGG CCC TCA GGC ATG TTC GAT TCT TCG GTC CTG TGT GAA TGT TAT GAC 240 
Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp 
65 70 75 80

GCG GGC TGT GCT TGG TAC GAG CTC ACG CCC GCC GAG ACC TCG GTT AGG 288 
Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg 

85 90 95 

TTG CGG GCT TAC CTA AAT ACA CCT GGG CTG CCC GTC TGC CAG GAC CAT 336 
Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His

100 105 110

CTG GAG TTC TGG GAG AGC GTC TTC ACC GGC CTC ACC CAC ATA GAT GCC 384 
Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala

115 120 125 

CAC TTC TTG TCC CAG ACC AAA CAG GCA GGA GAC AAC TTC CCC TAC CTG 432 
His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu
130 135 140 
GTA GCA TAC CAG GCT ACA GTG TGC GCC AGG GCC AAG GCT CCA CCT CCA 480
Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro Pro
145 150 155 160

TCG TGG GAT CAG ATG TGG AAG TGT CTC ATA CGG CTG AAG CCT ACG CTA 528
Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu

165 170 175

CAC GGG CCA ACG CCC CTG TTG TAT AGG TTA GGA GCC GTT CAG AAC GAG 576
His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu

180 185 190

GTT ACC CTC ACA CAC CCT ATA ACC AAG TAC ATC ATG ACA TGC ATG TC 623
Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Thr Cys Met
195 200 205



SEQ ID NO: 55 

SEQUENCE LENGTH: 623 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N16-3 



GGC TAT ACC GGC GAC TTC GAC TCA GTG ATC GAC TGC AAC ACA TGT GTC 48 
Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val

1 5 10 15

ACC CAA ACA GTC GAT TTC AGC TTG GAC CCT ACT TTC ACC ATT GAG ACG 96 
Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr

20 25 30 

ACG ACC GTA CCC CAA GAT GCG GTG TCG CGC TCG CAG CGG CGA GGC AGG 144 
Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg

35 40 45 

ACT GGT AGG GGC AGA GGG GGC ATA TAC AGG TTT GTA ACT CCA GGG GAA 192 
Thr Gly Arg Gly Arg Gly Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu

50 55 60 

CGG CCC TCA GGC ATG TTC GAT TCT TCG GTC CTG TGT GAA TGT TAT GAC 240 
Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp
65 70 75 80 

GCG GGC TGT GCT TGG TAC GAG CTC ACG TCC GCC GAG ACC TCG GTT AGG 288 
Ala Gly Cys Ala Trp Tyr Glu Leu Thr Ser Ala Glu Thr Ser Val Arg

85 90 95 

TTG CGG GCT TAC CTA AAC ACA CCT GGG CTG CCC GTC TGC CAG GAC CAT 336 
Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His
100 105 110

CTG GAG TTC TGG GAG AGC GTC TTC ACC GGC CTC ACC CAC ATA GAT GCC 384 
Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala
115 120 125

CAC TTC TTG TCC CAG ACT AAA CAG GCA GGA GAC AAC TTC CCC TAC CTG 432
His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu

130 135 140 

GTA GCA TAC CAG GCT ACA GTG TGC GCC AGG GCC AAG GCT CCA CCT CCA 480 
Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro Pro

145 150 155 160 

TCG TGG GAT CAG ATG TGG AAG TGT CTC ATA CGG CTG AAG CCT ACG CTA 528 
Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu

165 170 175 

CAC GGG CCA ACG CCC CTG TTG CAT AGG TTA GGA GCC GTT CAG AAC AAG 576 
His Gly Pro Thr Pro Leu Leu His Arg Leu Gly Ala Val Gln Asn Lys

180 185 190 

GTT GCC CTC ACA CAC CCC ATA ACC AAG TAC ATC ATG ACA TGC ATG TC 623 
Val Ala Leu Thr His Pro Ile Thr Lys Tyr Ile Met Thr Cys Met
195 200 205 
SEQ ID NO: 56

SEQUENCE LENGTH: 1280 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: MX25026A-1

TGT GCC TGG TTG TGG ATG ATG CTG CTG ATA GCC CAA GCT GAG GCC GCC 48
Cys Ala Trp Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala

1 5 10 15

TTG GAG AAC CTG GTG GTC CTC AAT GCA GCA TCC ATG GCG GGA GCG CAT 96 
Leu Glu Asn Leu Val Val Leu Asn Ala Ala Ser Met Ala Gly Ala His 

20 25 30 

GGC ATC CTC TCT TTC CTT GTG TTC TTC TGT GCC GCC TGG TAC ATC AAA 144 
Gly Ile Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys

35 40 45 

GGC AGG CTG GTC CCT GGG GCG GCA TAC GCT TTC TAT GGC GTA TGG CCG 192 
Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro 
50 55 60

CTG CTC CTG CTC TTG ATG GCG CTA CCC GCA CGG GCG TAC GCC ATG GAC 240 
Leu Leu Leu Leu Leu Met Ala Leu Pro Ala Arg Ala Tyr Ala Met Asp 
65 70 75 80 

CGG GAG ATG GCT GCA TCG TGC GGA GGC GCG GTT TTT GTA GGT CTG GTA 288 
Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly Leu Val 

85 90 95 

CTC TTG ACC TTG TCA CCA TAC TAC AAA GTG TTC CTC GCT AAG CTC ATA 336 
Leu Leu Thr Leu Ser Pro Tyr Tyr Lys Val Phe Leu Ala Lys Leu Ile

100 105 110

TGG TGG TTG CAA TAT CTC ATC ACC AGG GCC GAG GCG CAC TTG CAA GTG 384 
Trp Trp Leu Gln Tyr Leu Ile Thr Arg Ala Glu Ala His Leu Gln Val

115 120 125

TGG ATC CCC CCC CTC AAC GTT CGG GGG GGC CGC GAT GCC ATC ATC CTT 432 
Trp Ile Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Ile Ile Leu

130 135 140 

CTC ACA TGT GCG GTC CAC CCG GAG CTG ATC TTT GAC ATC ACC AAG CTC 480 
Leu Thr Cys Ala Val His Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu
145 150 155 160 

TTG CTC GCC ATA CTC GGT CCG CTC ATG GTA CTC CAG GCT GGC CTA ACC 528 
Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala Gly Leu Thr

165 170 175 

CAA ATG CCG TAC TTT GTG CGT GCT CAA GGG CTC ATT CGT ATG TGC ATG 576 
Gln Met Pro Tyr Phe Val Arg Ala Gln Gly Leu Ile Arg Met Cys Met

180 185 190 

TTG GTG CGG AAA GCC GCT GGG GGT CAT TAT GTC CAG ATG GCT CTC ATG 624 
Leu Val Arg Lys Ala Ala Gly Gly His Tyr Val Gln Met Ala Leu Met

195 200 205 

AAG CTG GCT GCA CTG ACA GGT ACG TAC GTT TAT GAC CAT CTT ACT CCA 672
Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu Thr Pro 

210 215 220 

CTG CAG GAC TGG GCC CAC GCG GGC CTA CGA GAC CTT GCG GTA GCA GTT 720 
Leu Gln Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val Ala Val
225 230 235 240

GAG CCC GTT GCC TTC TCT GAT ATG GAG ACT AAG ATC ATC ACC TGG GGG 768 
Glu Pro Val Ala Phe Ser Asp Met Glu Thr Lys Ile Ile Thr Trp Gly

245 250 255 

GCA GAC ACT GCG GCG TGT GGG GAC ATC ATT TTG GGC CTA CCT GTC TCC 816 
Ala Asp Thr Ala Ala Cys Gly Asp Ile Ile Leu Gly Leu Pro Val Ser

260 265 270

GCC CGG AGG GGC AAC GAG ATA CTC CTC GGA CCG GCC GAT AGT TTT GAC 864 
Ala Arg Arg Gly Asn Glu Ile Leu Leu Gly Pro Ala Asp Ser Phe Asp

275 280 285

GGG CAG GGG TGG CGA CTC CTT GCG CCT ATC ACG GCC TAC TCC CAG CAG 912 
Gly Gln Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ser Gln Gln

290 295 300 

ACG CGG GGC CTG CTT GGT TGC ATC ATC ACT AGC CTT ACG GGC CGG GAT 960 
Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp
305 310 315 320 

AAG AAC CAG GTC GAG GGG GAG GTT CAA GTG GTC TCT ACC GCA ACA CAA 1008 
Lys Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln

325 330 335 

TCT TTC CTG GCG ACC TGT GTC AAC GGC GTG TGC TGG ACT GTT TTC CAC 1056 
Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val Phe His 

340 345 350 

GGC GCC GGC TCG AAG ACC TTA GCC GGC CCA AAA GGC CCA ATC ACC CAA 1104 
Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln

355 360 365 

ATG TAC ACC AAT GTA GAT CAG GAC CTC GTC GGC TGG TCG GCG CCC CCC 1152 
Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Ser Ala Pro Pro

370 375 380 

CGG GCG CGT TCC TTG ACA CCT TGC ACC TGC GGC AGC TCG GAC CTT TAT 1200 

Arg Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr 
385 390 395 400

TTG GTC ACG AGG CAT GCT GAT GTC ATT CCG GTG CAC CGG CGG GGC GAC 1248 

Leu Val Thr Arg His Ala Asp Val Ile Pro Val His Arg Arg Gly Asp

405 410 415 

AGC AGG GGG AGC CTC CTC TCC CCC GGG CCC AT 1280 
Ser Arg Gly Ser Leu Leu Ser Pro Gly Pro 

420 425 



SEQ ID NO: 57

SEQUENCE LENGTH: 1280 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: MX25026B-1

TGT GCC TGG TTG TGG ATG ATG CTG CTG ATA GCC CAA GCT GAG GCC GCC 48
Cys Ala Trp Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala

1 5 10 15 

TTG GAG AAC CTG GTG GTC CTC AAT GCA GCA TCC ATG GCG GGA GCG CAT 96 
Leu Glu Asn Leu Val Val Leu Asn Ala Ala Ser Met Ala Gly Ala His 

20 25 30 

GGC ATC CTC TCT TTC CTT GTG TTC TTC TGT GCC GCC TGG TAC ATC AAA 144 
Gly Ile Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys

35 40 45 

GGC AGG CTG GTC CCT GGG GCG GCA TAC GCT TTC TAT GGC GTA TGG CCG 192 
Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro 

50 55 60 

CTG CTC CTG CTC TTG ATG GCG CTA CCC GCA CGG GCG TAC GCC ATG GAC 240 
Leu Leu Leu Leu Leu Met Ala Leu Pro Ala Arg Ala Tyr Ala Met Asp 
65 70 75 80 

CGG GAG ATG GCT GCA TCG TGC GGA GGC GCG GTT TTT GTA GGT CTG GTA 288 
Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly Leu Val 

85 90 95 
CTC TTG ACC TTG TCA CCA TAC TAC AAA GTG TTC CTC GCT AAG CTC ATA 336
Leu Leu Thr Leu Ser Pro Tyr Tyr Lys Val Phe Leu Ala Lys Leu Ile

100 105 110 

TGG TGG TTG CAA TAT CTC ATC ACC AGG GCC GAG GCG CAC TTG CAA GTG 384
Trp Trp Leu Gln Tyr Leu Ile Thr Arg Ala Glu Ala His Leu Gln Val

115 120 125 

TGG ATC CCC CCC CTC AAC GTT CGG GGG GGC CGC GAT GCC ATC ATC CTT 432 
Trp Ile Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Ile Ile Leu

130 135 140 

CTC ACA TGT GCG GTC CAC CCG GAG CTG ATC TTT GAC ATC ACC AAG CTC 480 
Leu Thr Cys Ala Val His Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu
145 150 155 160

TTG CTC GCC ATA CTC GGT CCG CTC ATG GTA CTC CAG GCT GGC CTA ACC 528 
Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala Gly Leu Thr

165 170 175 

CAA ATG CCG TAC TTT GTG CGT GCT CAA GGG CTC ATT CGT ATG TGC ATG 576 
Gln Met Pro Tyr Phe Val Arg Ala Gln Gly Leu Ile Arg Met Cys Met

180 185 190 

TTG GTG CGG AAA GCC GCT GGG GGT CAT TAT GTC CAG ATG GCT CTC ATG 624 
Leu Val Arg Lys Ala Ala Gly Gly His Tyr Val Gln Met Ala Leu Met

195 200 205 

AAG CTG GCT GCA CTG ACA GGT ACG TAC GTT TAT GAC CAT CTT ACT CCA 672 
Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu Thr Pro 

210 215 220 

CTG CAG GAC TGG GCC CAC GCG GGC CTA CGA GAC CTT GCG GTA GCA GTT 720 
Leu Gln Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val Ala Val
225 230 235 240 

GAG CCC GTT GCC TTC TCT GAT ATG GAG ACT AAG ATC ATC ACG TGG GGG 768 
Glu Pro Val Ala Phe Ser Asp Met Glu Thr Lys Ile Ile Thr Trp Gly

245 250 255 

GCA GAG ACG GCG GCG TGT GGG GAC ATC ATC TCG GGT CTA CCC GTT TCC 816 
Ala Glu Thr Ala Ala Cys Gly Asp Ile Ile Ser Gly Leu Pro Val Ser

260 265 270 

GCC CGA AGG GGG AGG GAG CTG CTT TTG GGG CCG GCC GAT AGT TTT GAC 864 
Ala Arg Arg Gly Arg Glu Leu Leu Leu Gly Pro Ala Asp Ser Phe Asp 
275 280 285 

GGG CAG GGG TGG CGA CTC CTT GCG CCT ATC ACG GCC TAC TCC CAG CAG 912
Gly Gln Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ser Gln Gln

290 295 300 

ACG CGG GGC CTG CTT GGT TGC ATC ATC ACT AGC CTT ACG GGC CGG GAT 960 
Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp
305 310 315 320 

AAG AAC CAG GTC GAG GGG GAG GTT CAA GTG GTC TCT ACC GCA ACA CAA 1008 
Lys Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln

325 330 335 

TCT TTC CTG GCG ACC TGT GTC AAC GGC GTG TGC TGG ACT GTT TTC CAC 1056 
Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val Phe His 

340 345 350 

GGC GCC GGC TCG AAG ACC TTA GCC GGC CCA AAA GGC CCA ATC ACC CAA 1104
Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln

355 360 365 

ATG TAC ACC AAT GTA GAT CAG GAC CTC GTC GGC TGG TCG GCG CCC CCC 1152 
Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Ser Ala Pro Pro

370 375 380 

CGG GCG CGT TCC TTG ACA CCT TGC ACC TGC GGC AGC TCG GAC CTT TAT 1200 
Arg Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr 
385 390 395 400 

TTG GTC ACG AGG CAT GCT GAT GTC ATT CCG GTG CAC CGG CGG GGC GAC 1248 
Leu Val Thr Arg His Ala Asp Val Ile Pro Val His Arg Arg Gly Asp

405 410 415 

AGC AGG GGG AGC CTC CTC TCC CCC GGG CCC AT 1280 
Ser Arg Gly Ser Leu Leu Ser Pro Gly Pro 

420 425 






SEQ ID NO: 58

SEQUENCE LENGTH: 1431 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N16N15A-1

GGC TAT ACC GGC GAC TTC GAC TCA GTG ATC GAC TGC AAC ACA TGT GTC 48
Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val
1 5 10 15

ACC CAA ACA GTC GAT TTC AGC TTG GAC CCT ACT TTC ACC ATC GAG ACG 96
Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr
20 25 30

ACG ACC GTA CCC CAA GAT GCG GTG TCG CGC TCG CAG CGG CGA GGC AGG 144
Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg
35 40 45

ACT GGT AGG GGC AGG GGG GGC ATA TAC AGG TTT GTA ACT CCA GGG GAA 192
Thr Gly Arg Gly Arg Gly Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu
50 55 60

CGG CCC TCA GGC ATG TTC GAT TCT TCG GTC CTG TGT GAA TGT TAT GAC 240
Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp
65 70 75 80

GCG GGC TGT GCT TGG TAC GAG CTC ACG CCC GCC GAG ACC TCG GTT AGG 288
Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg
85 90 95

TTG CGG GCT TAC CTA AAT ACA CCT GGG CTG CCC GTC TGC CAG GAC CAT 336
Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His
100 105 110

CTG GAG TTC TGG GAG AGC GTC TTC ACC GGC CTC ACC CAC ATA GAT GCC 384
Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala
115 120 125

CAC TTC TTG TCC CAG ACC AAA CAG GCA GGA GAC AAC TTC CCC TAC CTG 432
His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu
130 135 140

GTA GCA TAC CAG GCT ACA GTG TGC GCC AGG GCC AAG GCT CCA CCT CCA 480
Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro Pro
145 150 155 160

TCG TGG GAT CAG ATG TGG AAG TGT CTC ATA CGG CTG AAG CCT ACG CTA 528
Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu
165 170 175

CAC GGG CCA ACG CCC CTG TTG TAT AGG TTA GGA GCC GTT CAG AAC GAG 576
His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu
180 185 190

GTT ACC CTT ACA CAC CCC ATA ACC AAG TAC ATC ATG ACA TGC ATG TCG 624
Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Thr Cys Met Ser

195 200 205

GCT GAC CTA GAG GTC GTC ACT AGC ACT TGG GTG CTG GTA GGC GGG GTC 672 
Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val

210 215 220 

CTC GCG GCT CTG GCC GCG TAC TGC CTG ACA ACG GGC AGC GTG GTC ATT 720 
Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val Ile
225 230 235 240

GTG GGC AGG ATC ATC TTG TCC GGG AGG CCG GCC GTT ATT CCC GAC AGG 768 
Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Val Ile Pro Asp Arg

245 250 255 

GAA GTT CTC TAC CAA GAG TTC GAT GAA ATG GAA GAG TGC GCC TCG CAC 816
Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His

260 265 270 

CTC CCT TAC ATC GAA CAA GGA ATG CAG CTC GCC GAG CAA TTC AAG CAG 864 
Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys Gln

275 280 285 

AAG GCG CTC GGT TTG CTG CAA ACA GCC ACC AAG CAA GCG GAG GCT GCT 912 
Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala Ala

290 295 300 

GCT CCC GTG GTG GAG TCC AAG TGG CGA GCC CTT GAG ACC TTC TGG GCG 960 
Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala 
305 310 315 320 

AAG CAC ATG TGG AAT TTC ATC AGC GGG ATA CAG TAC TTA GCA GGC TTG 1008 
Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu

325 330 335 

TCC ACT CTG CCT GGA AAC CCC GCA ATA GCA TCA CTG ATG GCA TTC ACA 1056 
Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr

340 345 350 

GCC TCT ATC ACC AGC CCG CTC ACC ACC CAA TAT ACC CTC CTG TTT AAC 1104 
Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Tyr Thr Leu Leu Phe Asn

355 360 365 

ATC TTG GGG GGA TGG GTG GCC GCC CAA CTC GCC CCC CCC AGT GCC GCT 1152 
Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala Ala

370 375 380 

TCA GCC TTC GTG GGC GCC GGT ATA GCT GGC GCG GCT GTT GGC AGC ATA 1200 
Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser Ile
385 390 395 400

GGC CTC GGG AAG GTG CTT GTG GAC ATT CTG GCG GGT TAT GGA GCA GGG 1248 
Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly
405 410 415

GTG GCA GGC GCG CTC GTG GCC TTT AAG GTC ATG AGC GGT GAC ATG CCC 1296 
Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Asp Met Pro 

420 425 430 

TCC ACC GAG GAC CTG GTC AAC TTA CTC CCC GCC ATC CTC TCT CCT GGT 1344
Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly

435 440 445 

GCC CTG GTC GTC GGG GTC GTG TGC GCA GCA ATA CTG CGT CGG CAT GTG 1392 
Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val

450 455 460 

GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 1431 
Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
465 470 475

SEQ ID NO: 59

SEQUENCE LENGTH: 1431 base pairs 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: N16N15B-1 



GGC TAT ACC GGC GAC TTC GAC TCA GTG ATC GAC TGC AAC ACA TGT GTC 48 
Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val

1 5 10 15 

ACC CAA ACA GTC GAT TTC AGC TTG GAC CCT ACT TTC ACC ATC GAG ACG 96 
Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr

20 25 30 

ACG ACC GTA CCC CAA GAT GCG GTG TCG CGC TCG CAG CGG CGA GGC AGG 144 
Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg
35 40 45

ACT GGT AGG GGC AGG GGG GGC ATA TAC AGG TTT GTA ACT CCA GGG GAA 192
Thr Gly Arg Gly Arg Gly Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu
50 55 60 

CGG CCC TCA GGC ATG TTC GAT TCT TCG GTC CTG TGT GAA TGT TAT GAC 240
Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp
65 70 75 80 

GCG GGC TGT GCT TGG TAC GAG CTC ACG CCC GCC GAG ACC TCG GTT AGG 288 
Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg

85 90 95 

TTG CGG GCT TAC CTA AAT ACA CCT GGG CTG CCC GTC TGC CAG GAC CAT 336 
Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His

100 105 110 

CTG GAG TTC TGG GAG AGC GTC TTC ACC GGC CTC ACC CAC ATA GAT GCC 384 
Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala

115 120 125 

CAC TTC TTG TCC CAG ACC AAA CAG GCA GGA GAC AAC TTC CCC TAC CTG 432 
His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu

130 135 140 

GTA GCA TAC CAG GCT ACA GTG TGC GCC AGG GCC AAG GCT CCA CCT CCA 480 
Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro Pro
145 150 155 160 

TCG TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAG CCT ACG CTA 528 
Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu

165 170 175 

CAC GGG CCA ACG CCC CTG TTG TAT AGG TTA GGA GCC GTT CAG AAC GAG 576 
His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu

180 185 190 

GTT ACC CTC ACA CAC CCC ATA ACC AAG TTC ATC ATG GCA TGC ATG TCG 624 
Val Thr Leu Thr His Pro Ile Thr Lys Phe Ile Met Ala Cys Met Ser

195 200 205 

GCT GAC CTA GAG GTC GTC ACT AGC ACT TGG GTG CTG GTA GGC GGG GTC 672 
Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val 
210 215 220

CTC GCG GCT CTG GCC GCG TAC TGC CTG ACA ACG GGC AGC GTG GTC ATT 720 
Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val Ile
225 230 235 240 

GTG GGC AGG ATC ATC TTG TCC GGG AGG CCG GCC GTT ATT CCC GAC AGG 768
Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Val Ile Pro Asp Arg

245 250 255 

GAA GTT CTC TAC CAA GAG TTC GAT GAA ATG GAA GAG TGC GCC TCG CAC 816 
Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His

260 265 270 

CTC CCT TAC ATC GAA CAA GGA ATG CAG CTC GCC GAG CAA TTC AAG CAG 864 
Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys Gln
275 280 285

AAG GCG CTC GGT TTG CTG CAA ACA GCC ACC AAG CAA GCG GAG GCT GCT 912 
Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala Ala

290 295 300 

GCT CCC GTG GTG GAG TCC AAG TGG CGA GCC CTT GAG ACC TTC TGG GCG 960 
Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala 
305 310 315 320 

AAG CAC ATG TGG AAT TTC ATC AGC GGG ATA CAG TAC TTA GCA GGC TTG 1008 
Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu

325 330 335 

TCC ACT CTG CCT GGA AAC CCC GCA ATA GCA TCA CTG ATG GCA TTC ACA 1056 
Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr

340 345 350 

GCC TCT ATC ACC AGC CCG CTC ACC ACC CAA TAT ACC CTC CTG TTT AAC 1104 
Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Tyr Thr Leu Leu Phe Asn

355 360 365 

ATC TTG GGG GGA TGG GTG GCC GCC CAA CTC GCC CCC CCC AGT GCC GCT 1152 
Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala Ala

370 375 380 

TCA GCC TTC GTG GGC GCC GGT ATA GCT GGC GCG GCT GTT GGC AGC ATA 1200 
Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser Ile
385 390 395 400 

GGC CTC GGG AAG GTG CTT GTG GAC ATT CTG GCG GGT TAT GGA GCA GGG 1248 
Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly

405 410 415 

GTG GCA GGC GCG CTC GTG GCC TTT AAG GTC ATG AGC GGT GAC ATG CCC 1296 
Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Asp Met Pro 

420 425 430 

TCC ACC GAG GAC CTG GTC AAC TTA CTC CCC GCC ATC CTC TCT CCT GGT 1344 
Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly

435 440 445

GCC CTG GTC GTC GGG GTC GTG TGC GCA GCA ATA CTG CGT CGG CAT GTG 1392 
Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val

450 455 460 

GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 1431 
Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
465 470 475 

SEQ ID NO: 60 

SEQUENCE LENGTH: 1431 base pairs 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double 

TOPOLOGY: linear 

ANTI-SENSE: No

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: N16N15-1 



GGC TAT ACC GGC GAC TTC GAC TCA GTG ATC GAC TGC AAC ACA TGT GTC 48 
Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val

1 5 10 15

ACC CAA ACA GTC GAT TTC AGC TTG GAC CCT ACT TTC ACC ATC GAG ACG 96 
Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr

20 25 30 

ACG ACC GTA CCC CAA GAT GCG GTG TCG CGC TCG CAG CGG CGA GGC AGG 144 
Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg

35 40 45 

ACT GGT AGG GGC AGG GGG GGC ATA TAC AGG TTT GTA ACT CCA GGG GAA 192 
Thr Gly Arg Gly Arg Gly Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu

50 55 60 

CGG CCC TCA GGC ATG TTC GAT TCT TCG GTC CTG TGT GAA TGT TAT GAC 240 
Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp 
65 70 75 80 

GCG GGC TGT GCT TGG TAC GAG CTC ACG CCC GCC GAG ACC TCG GTT AGG 288 
Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg 
85 90 95

TTG CGG GCT TAC CTA AAT ACA CCT GGG CTG CCC GTC TGC CAG GAC CAT 336
Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His

100 105 110

CTG GAG TTC TGG GAG AGC GTC TTC ACC GGC CTC ACC CAC ATA GAT GCC 384
Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala

115 120 125

CAC TTC TTG TCC CAG ACC AAA CAG GCA GGA GAC AAC TTC CCC TAC CTG 432
His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu

130 135 140

GTA GCA TAC CAG GCT ACA GTG TGC GCC AGG GCC AAG GCT CCA CCT CCA 480
Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala Pro Pro Pro
145 150 155 160

TCG TGG GAT CAG ATG TGG AAG TGT CTC ATA CGG CTG AAG CCT ACG CTA 528
Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu

165 170 175

CAC GGG CCA ACG CCC CTG TTG TAT AGG TTA GGA GCC GTT CAG AAC GAG 576
His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu

180 185 190

GTT ACC CTC ACA CAC CCC ATA ACC AAG TTC ATC ATG GCA TGC ATG TCG 624
Val Thr Leu Thr His Pro Ile Thr Lys Phe Ile Met Ala Cys Met Ser

195 200 205

GCT GAC CTA GAG GTC GTC ACT AGC ACT TGG GTG CTG GTA GGC GGG GTC 672
Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val

210 215 220

CTC GCG GCT CTG GCC GCG TAC TGC CTG ACA ACG GGC AGC GTG GTC ATT 720
Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val Ile
225 230 235 240

GTG GGC AGG ATC ATC TTG TCC GGG AGG CCG GCC GTT ATT CCC GAC AGG 768
Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Val Ile Pro Asp Arg

245 250 255

GAA GTT CTC TAC CAA GAG TTC GAT GAA ATG GAA GAG TGC GCC TCG CAC 816
Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His

260 265 270

CTC CCT TAC ATC GAA CAA GGA ATG CAG CTC GCC GAG CAA TTC AAG CAG 864
Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys Gln

275 280 285

AAG GCG CTC GGT TTG CTG CAA ACA GCC ACC AAG CAA GCG GAG GCT GCT 912
Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala Ala

290 295 300 

GCT CCC GTG GTG GAG TCC AAG TGG CGA GCC CTT GAG ACC TTC TGG GCG 960 
Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala

305 310 315 320 

AAG CAC ATG TGG AAT TTC ATC AGC GGG ATA CAG TAC TTA GCA GGC TTG 1008 
Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu
325 330 335

TCC ACT CTG CCT GGA AAC CCC GCA ATA GCA TCA CTG ATG GCA TTC ACA 1056 
Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr

340 345 350 

GCC TCT ATC ACC AGC CCG CTC ACC ACC CAA TAT ACC CTC CTG TTT AAC 1104
Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Tyr Thr Leu Leu Phe Asn

355 360 365 

ATC TTG GGG GGA TGG GTG GCC GCC CAA CTC GCC CCC CCC AGT GCC GCT 1152 
Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala Ala

370 375 380 

TCA GCC TTC GTG GGC GCC GGT ATA GCT GGC GCG GCT GTT GGC AGC ATA 1200 
Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser Ile 
385 390 395 400 

GGC CTC GGG AAG GTG CTT GTG GAC ATT CTG GCG GGT TAT GGA GCA GGG 1248 
Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly 

405 410 415 

GTG GCA GGC GCG CTC GTG GCC TTT AAG GTC ATG AGC GGT GAC ATG CCC 1296 
Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Asp Met Pro 

420 425 430 

TCC ACC GAG GAC CTG GTC AAC TTA CTC CCC GCC ATC CTC TCT CCT GGT 1344 
Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly 

435 440 445 

GCC CTG GTC GTC GGG GTC GTG TGC GCA GCA ATA CTG CGT CGG CAT GTG 1392 
Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val

450 455 460 

GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 1431 
Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
465 470 475 

SEQ ID NO: 61






SEQUENCE LENGTH: 2304 base pairs 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double

TOPOLOGY: linear

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE

CLONE: N23N15A-1 






CTG CTG TCG CCC GGG CCC ATC TCT TAC TTG AAG GGT TCC TCG GGT GGT 48
Leu Leu Ser Pro Gly Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly

1 5 10 15

CCG CTG CCT TGC CCC TCG GGC CGT GTT GTG GGC ATC TTC CGG GCT GCC 96 
Pro Leu Pro Cys Pro Ser Gly Arg Val Val Gly Ile Phe Arg Ala Ala

20 25 30 

GTG TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTT GAG 144 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
35 40 45 

TCT ATG GAA ACC ACC ATG CGG TCT CCG GTC TTC ACG GAT AAC TCA ACC 192
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Thr

50 55 60 

CCC CCG GCC GTA CCG CAG ACA TTC CAA GTG GCC CAC CTA CAC GCT CCC 240 
Pro Pro Ala Val Pro Gln Thr Phe Gln Val Ala His Leu His Ala Pro

65 70 75 80 

ACT GGC AGC GGC AAA AGC ACC AGG GTG CCG GCT GCG TAT GCG GCC CAA 288 
Thr Gly Ser Gly Lys Ser Thr Arg Val Pro Ala Ala Tyr Ala Ala Gln
85 90 95

GGG TAC AAG GTA CTC GTC CTG AAC CCG TCC GTT GCT GCC ACT TTG GGC 336 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 

100 105 110 

TTT GGG GCG TAC ATG TCC AAG GCA CAT GGT GTT GAC CCT AAC ATC AGA 384
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg

115 120 125 

ACT GGG GTG AGG ACC ATC ACC ACG GGC GCT CCC ATC ACG TAC TCC ACC 432 
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Ile Thr Tyr Ser Thr
130 135 140

TAC GGT AAG TTC CTC GCC GAC GGT GGC TGT TCT GGG GGT GCC TAT GAC 480
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
145 150 155 160 

ATC ATA ATA TGT GAT GAG TGT CAT TCA ACT GAC TCG ACT TCC ATC TTG 528
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu

165 170 175 

GGC ATT GGT ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGC CTT 576 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu

180 185 190 

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCG CAT 624 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 

195 200 205 

CCT AAT ATT GAG GAG GTG GCC TTG TCC AAC ACT GGA GAG ATC CCC TTC 672 
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe

210 215 220 

TAT GGC AAG GCC ATC CCC CTC GAG GCC ATC AAG GGG GGG AGG CAT CTC 720 
Tyr Gly Lys Ala Ile Pro Leu Glu Ala Ile Lys Gly Gly Arg His Leu
225 230 235 240 

ATT TTC TGC CAT TCC AAG AAG AAA TGT GAC GAG CTC GCT GCG AAG CTG 768 
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu

245 250 255 

TCG GCC CTC GGA GTC AAC GCT GTA GCA TAT TAC CGG GGT CTT GAT GTG 816 
Ser Ala Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 

260 265 270 

TCC ATC ATA CCG ACA AGC GGG GAC GTC GTT GTC GTG GCA ACA GAC GCT 864 
Ser Ile Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala

275 280 285 

CTA ATG ACG GGC TAT ACC GGT GAC TTT GAC TCG GTG ATC GAC TGC AAC 912 
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn

290 295 300 

ACA TGT GTC ACC CAA ACA GTC GAT TTC AGC TTG GAC CCT ACT TTC ACC 960 
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr
305 310 315 320

ATC GAG ACG ACG ACC GTA CCC CAA GAT GCG GTG TCG CGC TCG CAG CGG 1008 
Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg

325 330 335 

CGA GGC AGG ACT GGT AGG GGC AGG GGG GGC ATA TAC AGG TTT GTA ACT 1056
Arg Gly Arg Thr Gly Arg Gly Arg Gly Gly Ile Tyr Arg Phe Val Thr

340 345 350 

CCA GGG GAA CGG CCC TCA GGC ATG TTC GAT TCT TCG GTC CTG TGT GAA 1104 
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 

355 360 365 

TGT TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACG CCC GCC GAG ACC 1152 
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 

370 375 380 

TCG GTT AGG TTG CGG GCT TAC CTA AAT ACA CCT GGG CTG CCC GTC TGC 1200 
Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
385 390 395 400 

CAG GAC CAT CTG GAG TTC TGG GAG AGC GTC TTC ACC GGC CTC ACC CAC 1248 
Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His

405 410 415 

ATA GAT GCC CAC TTC TTG TCC CAG ACC AAA CAG GCA GGA GAC AAC TTC 1296 
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe

420 425 430

CCC TAC CTG GTA GCA TAC CAG GCT ACA GTG TGC GCC AGG GCC AAG GCT 1344 
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala

435 440 445 

CCA CCT CCA TCG TGG GAT CAG ATG TGG AAG TGT CTC ATA CGG CTG AAG 1392 
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys

450 455 460 

CCT ACG CTA CAC GGG CCA ACG CCC CTG TTG TAT AGG TTA GGA GCC GTT 1440 
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
465 470 475 480 

CAG AAC GAG GTT ACC CTC ACA CAC CCC ATA ACC AAG TTC ATC ATG GCA 1488 
Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Phe Ile Met Ala

485 490 495 

TGC ATG TCG GCT GAC CTA GAG GTC GTC ACT AGC ACT TGG GTG CTG GTA 1536 
Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val 

500 505 510 

GGC GGG GTC CTC GCG GCT CTG GCC GCG TAC TGC CTG ACA ACG GGC AGC 1584 
Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser 

515 520 525 

GTG GTC ATT GTG GGC AGG ATC ATC TTG TCC GGG AGG CCG GCC GTT ATT 1632 
Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Val Ile






530 535 540 

CCC GAC AGG GAA GTT CTC TAC CAA GAG TTC GAT GAA ATG GAA GAG TGC 1680 
Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys
545 550 555 560 

GCC TCG CAC CTC CCT TAC ATC GAA CAA GGA ATG CAG CTC GCC GAG CAA 1728 
Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln

565 570 575 

TTC AAG CAG AAG GCG CTC GGT TTG CTG CAA ACA GCC ACC AAG CAA GCG 1776 
Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala

580 585 590 

GAG GCT GCT GCT CCC GTG GTG GAG TCC AAG TGG CGA GCC CTT GAG ACC 1824 
Glu Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr 

595 600 605 

TTC TGG GCG AAG CAC ATG TGG AAT TTC ATC AGC GGG ATA CAG TAC TTA 1872 
Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu

610 615 620 

GCA GGC TTG TCC ACT CTG CCT GGA AAC CCC GCA ATA GCA TCA CTG ATG 1920 
Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met
625 630 635 640 

GCA TTC ACA GCC TCT ATC ACC AGC CCG CTC ACC ACC CAA TAT ACC CTC 1968 
Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Tyr Thr Leu

645 650 655 

CTG TTT AAC ATC TTG GGG GGA TGG GTG GCC GCC CAA CTC GCC CCC CCC 2016 
Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro

660 665 670 

AGT GCC GCT TCA GCC TTC GTG GGC GCC GGT ATA GCT GGC GCG GCT GTT 2064 
Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val
675 680 685 

GGC AGC ATA GGC CTC GGG AAG GTG CTT GTG GAC ATT CTG GCG GGT TAT 2112
Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr

690 695 700 

GGA GCA GGG GTG GCA GGC GCG CTC GTG GCC TTT AAG GTC ATG AGC GGT 2160 
Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly
705 710 715 720 

GAC ATG CCC TCC ACC GAG GAC CTG GTC AAC TTA CTC CCC GCC ATC CTC 2208 
Asp Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu
725 730 735

TCT CCT GGT GCC CTG GTC GTC GGG GTC GTG TGC GCA GCA ATA CTG CGT 2256
Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg

740 745 750 

CGG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 2304
Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
755 760 765 

SEQ ID NO: 62

SEQUENCE LENGTH: 2304 base pairs 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double 
TOPOLOGY: linear

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE

CLONE: N23N15B-1

CTG CTG TCG CCC GGG CCC ATC TCT TAC TTG AAG GGT TCC TCG GGT GGT 48
Leu Leu Ser Pro Gly Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly

1 5 10 15 

CCG CTG CCT TGC CCC TCG GGC CGT GTT GTG GGC ATC TTC CGG GCT GCC 96 
Pro Leu Pro Cys Pro Ser Gly Arg Val Val Gly Ile Phe Arg Ala Ala

20 25 30 

GTG TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTT GAG 144
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu

35 40 45 

TCT ATG GAA ACC ACC ATG CGG TCT CCG GTC TTC ACG GAT AAC TCA ACC 192
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Thr

50 55 60 

CCC CCG GCC GTA CCG CAG ACA TTC CAA GTG GCC CAC CTA CAC GCT CCC 240
Pro Pro Ala Val Pro Gln Thr Phe Gln Val Ala His Leu His Ala Pro
65 70 75 80

ACT GGC AGC GGC AAA AGC ACC AGG GTG CCG GCT GCG TAT GCG GCC CAA 288
Thr Gly Ser Gly Lys Ser Thr Arg Val Pro Ala Ala Tyr Ala Ala Gln

85 90 95 

GGG TAC AAG GTA CTC GTC CTG AAC CCG TCC GTT GCT GCC ACT TTG GGC 336
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly

100 105 110

TTT GGG GCG TAC ATG TCC AAG GCA CAT GGT GTT GAC CCT AAC ATC AGA 384
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg

115 120 125 

ACT GGG GTG AGG ACC ATC ACC ACG GGC GCT CCC ATC ACG TAC TCC ACC 432
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Ile Thr Tyr Ser Thr

130 135 140 

TAC GGT AAG TTC CTC GCC GAC GGT GGC TGT TCT GGG GGT GCC TAT GAC 480 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
145 150 155 160

ATC ATA ATA TGT GAT GAG TGT CAT TCA ACT GAC TCG ACT TCC ATC TTG 528 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu

165 170 175 

GGC ATT GGT ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGC CTT 576 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu

180 185 190 

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCG CAT 624 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
195 200 205

CCT AAT ATT GAG GAG GTG GCC TTG TCC AAC ACT GGA GAG ATC CCC TTC 672 
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe

210 215 220 

TAT GGC AAG GCC ATC CCC CTC GAG GCC ATC AAG GGG GGG AGG CAT CTC 720 
Tyr Gly Lys Ala Ile Pro Leu Glu Ala Ile Lys Gly Gly Arg His Leu
225 230 235 240

ATT TTC TGC CAT TCC AAG AAG AAA TGT GAC GAG CTC GCT GCG AAG CTG 768 
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu

245 250 255 

TCG GCC CTC GGA GTC AAC GCT GTA GCA TAT TAC CGG GGT CTT GAT GTG 816 
Ser Ala Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 

260 265 270 

TCC ATC ATA CCG ACA AGC GGG GAC GTC GTT GTC GTG GCA ACA GAC GCT 864 
Ser Ile Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala

275 280 285 

CTA ATG ACG GGC TAT ACC GGC GAC TTC GAC TCA GTG ATC GAC TGC AAC 912 
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn

290 295 300

ACA TGT GTC ACC CAA ACA GTC GAT TTC AGC TTG GAC CCT ACT TTC ACC 960 
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr
305 310 315 320

ATC GAG ACG ACG ACC GTA CCC CAA GAT GCG GTG TCG CGC TCG CAG CGG 1008 
Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg

325 330 335 

CGA GGC AGG ACT GGT AGG GGC AGG GGG GGC ATA TAC AGG TTT GTA ACT 1056
Arg Gly Arg Thr Gly Arg Gly Arg Gly Gly Ile Tyr Arg Phe Val Thr

340 345 350 

CCA GGG GAA CGG CCC TCA GGC ATG TTC GAT TCT TCG GTC CTG TGT GAA 1104 
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu

355 360 365 

TGT TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACG CCC GCC GAG ACC 1152 
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 

370 375 380 

TCG GTT AGG TTG CGG GCT TAC CTA AAT ACA CCT GGG CTG CCC GTC TGC 1200 
Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
385 390 395 400 

CAG GAC CAT CTG GAG TTC TGG GAG AGC GTC TTC ACC GGC CTC ACC CAC 1248 
Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His

405 410 415 

ATA GAT GCC CAC TTC TTG TCC CAG ACC AAA CAG GCA GGA GAC AAC TTC 1296 
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe

420 425 430 

CCC TAC CTG GTA GCA TAC CAG GCT ACA GTG TGC GCC AGG GCC AAG GCT 1344 
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys Ala

435 440 445 

CCA CCT CCA TCG TGG GAT CAG ATG TGG AAG TGT CTC ATA CGG CTG AAG 1392 
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys

450 455 460 

CCT ACG CTA CAC GGG CCA ACG CCC CTG TTG TAT AGG TTA GGA GCC GTT 1440 
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
465 470 475 480

CAG AAC GAG GTT ACC CTC ACA CAC CCC ATA ACC AAG TTC ATC ATG GCA 1488 
Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Phe Ile Met Ala
485 490 495

TGC ATG TCG GCT GAC CTA GAG GTC GTC ACT AGC ACT TGG GTG CTG GTA 1536
Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val 

500 505 510 

GGC GGG GTC CTC GCG GCT CTG GCC GCG TAC TGC CTG ACA ACG GGC AGC 1584 
Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser 

515 520 525 

GTG GTC ATT GTG GGC AGG ATC ATC TTG TCC GGG AGG CCG GCC GTT ATT 1632 
Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Val Ile

530 535 540 

CCC GAC AGG GAA GTT CTC TAC CAA GAG TTC GAT GAA ATG GAA GAG TGC 1680 
Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys
545 550 555 560 

GCC TCG CAC CTC CCT TAC ATC GAA CAA GGA ATG CAG CTC GCC GAG CAA 1728 
Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln

565 570 575 

TTC AAG CAG AAG GCG CTC GGT TTG CTG CAA ACA GCC ACC AAG CAA GCG 1776 
Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala

580 585 590 

GAG GCT GCT GCT CCC GTG GTG GAG TCC AAG TGG CGA GCC CTT GAG ACC 1824 
Glu Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr 

595 600 605 

TTC TGG GCG AAG CAC ATG TGG AAT TTC ATC AGC GGG ATA CAG TAC TTA 1872 
Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu

610 615 620 

GCA GGC TTG TCC ACT CTG CCT GGA AAC CCC GCA ATA GCA TCA CTG ATG 1920 
Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met
625 630 635 640

GCA TTC ACA GCC TCT ATC ACC AGC CCG CTC ACC ACC CAA TAT ACC CTC 1968 
Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Tyr Thr Leu

645 650 655 

CTG TTT AAC ATC TTG GGG GGA TGG GTG GCC GCC CAA CTC GCC CCC CCC 2016 
Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro

660 665 670 

AGT GCC GCT TCA GCC TTC GTG GGC GCC GGT ATA GCT GGC GCG GCT GTT 2064 
Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val
675 680 685 

GGC AGC ATA GGC CTC GGG AAG GTG CTT GTG GAC ATT CTG GCG GGT TAT 2112
Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr

690 695 700 

GGA GCA GGG GTG GCA GGC GCG CTC GTG GCC TTT AAG GTC ATG AGC GGT 2160 
Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly

705 710 715 720 

GAC ATG CCC TCC ACC GAG GAC CTG GTC AAC TTA CTC CCC GCC ATC CTC 2208 
Asp Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu
725 730 735

TCT CCT GGT GCC CTG GTC GTC GGG GTC GTG TGC GCA GCA ATA CTG CGT 2256 
Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg

740 745 750 

CGG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 2304 
Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
755 760 765

SEQ ID NO: 63

SEQUENCE LENGTH: 3564 base pairs 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: MX25N15-1 



TGT GCC TGG TTG TGG ATG ATG CTG CTG ATA GCC CAA GCT GAG GCC GCC 48 
Cys Ala Trp Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala

1 5 10 15

TTG GAG AAC CTG GTG GTC CTC AAT GCA GCA TCC ATG GCG GGA GCG CAT 96 
Leu Glu Asn Leu Val Val Leu Asn Ala Ala Ser Met Ala Gly Ala His 

20 25 30 

GGC ATC CTC TCT TTC CTT GTG TTC TTC TGT GCC GCC TGG TAC ATC AAA 144
Gly Ile Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys

35 40 45 

GGC AGG CTG GTC CCT GGG GCG GCA TAC GCT TTC TAT GGC GTA TGG CCG 192 
Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro

50 55 60

CTG CTC CTG CTC TTG ATG GCG CTA CCC GCA CGG GCG TAC GCC ATG GAC 240 
Leu Leu Leu Leu Leu Met Ala Leu Pro Ala Arg Ala Tyr Ala Met Asp 
65 70 75 80

CGG GAG ATG GCT GCA TCG TGC GGA GGC GCG GTT TTT GTA GGT CTG GTA 288 
Arg Glu Met Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly Leu Val 

85 90 95 

CTC TTG ACC TTG TCA CCA TAC TAC AAA GTG TTC CTC GCT AAG CTC ATA 336
Leu Leu Thr Leu Ser Pro Tyr Tyr Lys Val Phe Leu Ala Lys Leu Ile

100 105 110

TGG TGG TTG CAA TAT CTC ATC ACC AGG GCC GAG GCG CAC TTG CAA GTG 384 
Trp Trp Leu Gln Tyr Leu Ile Thr Arg Ala Glu Ala His Leu Gln Val

115 120 125 

TGG ATC CCC CCC CTC AAC GTT CGG GGG GGC CGC GAT GCC ATC ATC CTT 432 
Trp Ile Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Ile Ile Leu

130 135 140 

CTC ACA TGT GCG GTC CAC CCG GAG CTG ATC TTT GAC ATC ACC AAG CTC 480 
Leu Thr Cys Ala Val His Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu
145 150 155 160 

TTG CTC GCC ATA CTC GGT CCG CTC ATG GTA CTC CAG GCT GGC CTA ACC 528 
Leu Leu Ala Ile Leu Gly Pro Leu Met Val Leu Gln Ala Gly Leu Thr

165 170 175 

CAA ATG CCG TAC TTT GTG CGT GCT CAA GGG CTC ATT CGT ATG TGC ATG 576 
Gln Met Pro Tyr Phe Val Arg Ala Gln Gly Leu Ile Arg Met Cys Met

180 185 190 

TTG GTG CGG AAA GCC GCT GGG GGT CAT TAT GTC CAG ATG GCT CTC ATG 624 
Leu Val Arg Lys Ala Ala Gly Gly His Tyr Val Gln Met Ala Leu Met

195 200 205 

AAG CTG GCT GCA CTG ACA GGT ACG TAC GTT TAT GAC CAT CTT ACT CCA 672 
Lys Leu Ala Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu Thr Pro 

210 215 220 

CTG CAG GAC TGG GCC CAC GCG GGC CTA CGA GAC CTT GCG GTA GCA GTT 720 
Leu Gln Asp Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val Ala Val
225 230 235 240 

GAG CCC GTT GCC TTC TCT GAT ATG GAG ACT AAG ATC ATC ACC TGG GGG 768 
Glu Pro Val Ala Phe Ser Asp Met Glu Thr Lys Ile Ile Thr Trp Gly

245 250 255

GCA GAC ACT GCG GCG TGT GGG GAC ATC ATT TTG GGC CTA CCT GTC TCC 816
Ala Asp Thr Ala Ala Cys Gly Asp Ile Ile Leu Gly Leu Pro Val Ser

260 265 270 

GCC CGG AGG GGC AAC GAG ATA CTC CTC GGA CCG GCC GAT AGT TTT GAC 864 
Ala Arg Arg Gly Asn Glu Ile Leu Leu Gly Pro Ala Asp Ser Phe Asp

275 280 285 

GGG CAG GGG TGG CGA CTC CTT GCG CCT ATC ACG GCC TAC TCC CAG CAG 912 
Gly Gln Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ser Gln Gln

290 295 300 

ACG CGG GGC CTG CTT GGT TGC ATC ATC ACT AGC CTT ACG GGC CGG GAT 960 
Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp
305 310 315 320 

AAG AAC CAG GTC GAG GGG GAG GTT CAA GTG GTC TCT ACC GCA ACA CAA 1008 
Lys Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln

325 330 335 

TCT TTC CTG GCG ACC TGT GTC AAC GGC GTG TGC TGG ACT GTT TTC CAC 1056 
Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val Phe His 

340 345 350 

GGC GCC GGC TCG AAG ACC TTA GCC GGC CCA AAA GGC CCA ATC ACC CAA 1104 
Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln

355 360 365 

ATG TAC ACC AAT GTA GAT CAG GAC CTC GTC GGC TGG TCG GCG CCC CCC 1152 
Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Ser Ala Pro Pro

370 375 380 

CGG GCG CGT TCC TTG ACA CCT TGC ACC TGC GGC AGC TCG GAC CTT TAT 1200 
Arg Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr 
385 390 395 400 

TTG GTC ACG AGG CAT GCT GAT GTC ATT CCG GTG CAC CGG CGG GGC GAC 1248 
Leu Val Thr Arg His Ala Asp Val Ile Pro Val His Arg Arg Gly Asp

405 410 415 

AGC AGG GGG AGC CTC CTC TCC CCC GGG CCC ATC TCT TAC TTG AAG GGT 1296 
Ser Arg Gly Ser Leu Leu Ser Pro Gly Pro Ile Ser Tyr Leu Lys Gly

420 425 430 

TCC TCG GGT GGT CCG CTG CCT TGC CCC TCG GGC CGT GTT GTG GGC ATC 1344 
Ser Ser Gly Gly Pro Leu Pro Cys Pro Ser Gly Arg Val Val Gly Ile
435 440 445 

TTC CGG GCT GCC GTG TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT 1392
Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe

450 455 460 

GTG CCC GTT GAG TCT ATG GAA ACC ACC ATG CGG TCT CCG GTC TTC ACG 1440 
Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr

465 470 475 480 

GAT AAC TCA ACC CCC CCG GCC GTA CCG CAG ACA TTC CAA GTG GCC CAC 1488 
Asp Asn Ser Thr Pro Pro Ala Val Pro Gln Thr Phe Gln Val Ala His
485 490 495

CTA CAC GCT CCC ACT GGC AGC GGC AAA AGC ACC AGG GTG CCG GCT GCG 1536 
Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Arg Val Pro Ala Ala 

500 505 510 

TAT GCG GCC CAA GGG TAC AAG GTA CTC GTC CTG AAC CCG TCC GTT GCT 1584
Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala

515 520 525 

GCC ACT TTG GGC TTT GGG GCG TAC ATG TCC AAG GCA CAT GGT GTT GAC 1632
Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp 

530 535 540 

CCT AAC ATC AGA ACT GGG GTG AGG ACC ATC ACC ACG GGC GCT CCC ATC 1680 
Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Ile
545 550 555 560 

ACG TAC TCC ACC TAC GGT AAG TTC CTC GCC GAC GGT GGC TGT TCT GGG 1728 
Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly 

565 570 575 

GGT GCC TAT GAC ATC ATA ATA TGT GAT GAG TGT CAT TCA ACT GAC TCG 1776 
Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser

580 585 590 

ACT TCC ATC TTG GGC ATT GGT ACA GTC CTG GAC CAA GCG GAG ACG GCT 1824 
Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala

595 600 605 

GGA GCG CGC CTT GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC 1872 
Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val 

610 615 620 

ACC GTG CCG CAT CCT AAT ATT GAG GAG GTG GCC TTG TCC AAC ACT GGA 1920 
Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly
625 630 635 640 

GAG ATC CCC TTC TAT GGC AAG GCC ATC CCC CTC GAG GCC ATC AAG GGG 1968 
Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Ala Ile Lys Gly

645 650 655

GGG AGG CAT CTC ATT TTC TGC CAT TCC AAG AAG AAA TGT GAC GAG CTC 2016 
Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu
660 665 670

GCT GCG AAG CTG TCG GCC CTC GGA GTC AAC GCT GTA GCA TAT TAC CGG 2064
Ala Ala Lys Leu Ser Ala Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg 
675 680 685 

GGT CTT GAT GTG TCC ATC ATA CCG ACA AGC GGG GAC GTC GTT GTC GTG 2112
Gly Leu Asp Val Ser Ile Ile Pro Thr Ser Gly Asp Val Val Val Val

690 695 700 

GCA ACA GAC GCT CTA ATG ACG GGC TAT ACC GGT GAC TTT GAC TCG GTG 2160 
Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val

705 710 715 720 

ATC GAC TGC AAC ACA TGT GTC ACC CAA ACA GTC GAT TTC AGC TTG GAC 2208 
Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp

725 730 735 

CCT ACT TTC ACC ATC GAG ACG ACG ACC GTA CCC CAA GAT GCG GTG TCG 2256 
Pro Thr Phe Thr Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser

740 745 750 

CGC TCG CAG CGG CGA GGC AGG ACT GGT AGG GGC AGG GGG GGC ATA TAC 2304 
Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Gly Gly Ile Tyr

755 760 765 

AGG TTT GTA ACT CCA GGG GAA CGG CCC TCA GGC ATG TTC GAT TCT TCG 2352 
Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser 

770 775 780 

GTC CTG TGT GAA TGT TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACG 2400 
Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr 
785 790 795 800 

CCC GCC GAG ACC TCG GTT AGG TTG CGG GCT TAC CTA AAT ACA CCT GGG 2448 
Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly 

805 810 815 

CTG CCC GTC TGC CAG GAC CAT CTG GAG TTC TGG GAG AGC GTC TTC ACC 2496 
Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr

820 825 830 

GGC CTC ACC CAC ATA GAT GCC CAC TTC TTG TCC CAG ACC AAA CAG GCA 2544 
Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala
835 840 845

GGA GAC AAC TTC CCC TAC CTG GTA GCA TAC CAG GCT ACA GTG TGC GCC 2592
Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala

850 855 860 

AGG GCC AAG GCT CCA CCT CCA TCG TGG GAT CAG ATG TGG AAG TGT CTC 2640 
Arg Ala Lys Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu
865 870 875 880 

ATA CGG CTG AAG CCT ACG CTA CAC GGG CCA ACG CCC CTG TTG TAT AGG 2688 
Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg

885 890 895 

TTA GGA GCC GTT CAG AAC GAG GTT ACC CTC ACA CAC CCC ATA ACC AAG 2736 
Leu Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys

900 905 910 

TTC ATC ATG GCA TGC ATG TCG GCT GAC CTA GAG GTC GTC ACT AGC ACT 2784 
Phe Ile Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr

915 920 925 

TGG GTG CTG GTA GGC GGG GTC CTC GCG GCT CTG GCC GCG TAC TGC CTG 2832 
Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu 

930 935 940 

ACA ACG GGC AGC GTG GTC ATT GTG GGC AGG ATC ATC TTG TCC GGG AGG 2880 
Thr Thr Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg
945 950 955 960 

CCG GCC GTT ATT CCC GAC AGG GAA GTT CTC TAC CAA GAG TTC GAT GAA 2928 
Pro Ala Val Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu

965 970 975 

ATG GAA GAG TGC GCC TCG CAC CTC CCT TAC ATC GAA CAA GGA ATG CAG 2976 
Met Glu Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln

980 985 990 

CTC GCC GAG CAA TTC AAG CAG AAG GCG CTC GGT TTG CTG CAA ACA GCC 3024 
Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala

995 1000 1005 

ACC AAG CAA GCG GAG GCT GCT GCT CCC GTG GTG GAG TCC AAG TGG CGA 3072 
Thr Lys Gln Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg

1010 1015 1020 

GCC CTT GAG ACC TTC TGG GCG AAG CAC ATG TGG AAT TTC ATC AGC GGG 3120 
Ala Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly
1025 1030 1035 1040

ATA CAG TAC TTA GCA GGC TTG TCC ACT CTG CCT GGA AAC CCC GCA ATA 3168
Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile

1045 1050 1055 

GCA TCA CTG ATG GCA TTC ACA GCC TCT ATC ACC AGC CCG CTC ACC ACC 3216 
Ala Ser Leu Met Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr

1060 1065 1070 

CAA TAT ACC CTC CTG TTT AAC ATC TTG GGG GGA TGG GTG GCC GCC CAA 3264 
Gln Tyr Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln
1075 1080 1085

CTC GCC CCC CCC AGT GCC GCT TCA GCC TTC GTG GGC GCC GGT ATA GCT 3312 
Leu Ala Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala
1090 1095 1100 

GGC GCG GCT GTT GGC AGC ATA GGC CTC GGG AAG GTG CTT GTG GAC ATT 3360
Gly Ala Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile
1105 1110 1115 1120

CTG GCG GGT TAT GGA GCA GGG GTG GCA GGC GCG CTC GTG GCC TTT AAG 3408 
Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys 

1125 1130 1135 

GTC ATG AGC GGT GAC ATG CCC TCC ACC GAG GAC CTG GTC AAC TTA CTC 3456 
Val Met Ser Gly Asp Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu 

1140 1145 1150 

CCC GCC ATC CTC TCT CCT GGT GCC CTG GTC GTC GGG GTC GTG TGC GCA 3504 
Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala

1155 1160 1165 

GCA ATA CTG CGT CGG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG 3552 
Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp

1170 1175 1180 

ATG AAC CGG CTG 3564
Met Asn Arg Leu 
1185 






SEQ ID NO: 64 

SEQUENCE LENGTH: 818 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N22-1, N22-3, H22-8, H22-9 


GG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 47 
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
1 5 10 15 

ATA GCG TTY GCY TCG CGG GGY AAC CAY GTC TCC CCC ACG CAY TAT GTG 95
Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val

20 25 30 

CCT GAR AGC GAC GCC GCR GCG CGY GTC ACC CAG ATC CTC TCC ARC CTY 143 
Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Xaa Leu

35 40 45 

ACC ATC ACT CAG YTG YTG AAG AGG CTY CAC CAG TGG ATT RAT GAK GAC 191 
Thr Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asx Xac Asp

50 55 60 

TGC TCC ACG CCA TGY TCY GGY TCG TGG CTC AGG GAT GTT TGG GAC TGG 239 
Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp 

65 70 75 

ATA TGC ACG GTR TTG RST GAY TKC AAG ACC TGG CTC CAG TCC AAG CTC 287 
Ile Cys Thr Val Leu Xad Asp Xae Lys Thr Trp Leu Gln Ser Lys Leu
80 85 90 95 

CTG CCG CGG YTA CCG GGR GTC CCT TTY YTY TCA TGC CAR CGT GGG TAC 335
Leu Pro Arg Leu Pro Gly Val Pro Phe Xaf Ser Cys Gln Arg Gly Tyr

100 105 110 

AAG GGR GTY TGG CGG GGA GAY GGC ATC ATG YAD ACC ACC TGC CCA TGY 383 
Lys Gly Val Trp Arg Gly Asp Gly Ile Met Xag Thr Thr Cys Pro Cys

115 120 125 

GGA GCA CAA ATC RCC GGA CAT GTC AAA AAY GGT TCY ATG AGG ATC RYT 431 
Gly Ala Gln Ile Xah Gly His Val Lys Asn Gly Ser Met Arg Ile Xai

130 135 140 

GGS CYY AGA ACC TGT AGC AAC ACG TGG CRC GGA ACR TTY CCC ATC AAC 479 
Gly Xaj Arg Thr Cys Ser Asn Thr Trp Xak Gly Thr Phe Pro Ile Asn

145 150 155 

GCG TAC ACC ACA GGC CCC TGC ACA CCC TCY CCR GCG CCR AAC TAY TCY 527 
Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser 
160 165 170 175

ARG GCG TTR TGG CGG GTR GCY RYT GAG GAG TAT GTG GAG GTC ACG CGG 575
Xal Ala Leu Trp Arg Val Ala Xam Glu Glu Tyr Val Glu Val Thr Arg
180 185 190

GTG GGG GAY TTC CAC TAC GTG ACG GGC ATG ACC ACT GAC AAC KTR AAA 623
Val Gly Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Xan Lys
195 200 205

TGC CCA TGC CAG GTY CCG GCC CCC GAA TTY TTC ACR GAR TTG GAT GGG 671
Cys Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Leu Asp Gly
210 215 220

GTR CGG CTR CRC AGR TAC GCT CCG GCG TGC AAA CCT CTC CTR CGG GAT 719
Val Arg Leu Xak Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg Asp
225 230 235

GAG GTC ACA TTC CAG GTC GGG CTC AAC CAA TWY MCG GTT GGG TCR CAG 767
Glu Val Thr Phe Gln Val Gly Leu Asn Gln Xao Xap Val Gly Ser Gln
240 245 250 255

CTM CCA TGY GAG CCC GAA CCG GAT GTA RYR GTG GTC ACC TCC ATG CTC 815
Leu Pro Cys Glu Pro Glu Pro Asp Val Xaq Val Val Thr Ser Met Leu
260 265 270

ACC 818
Thr



Y : C or T  R : A or G  M : A or C  K : G or T
S : G or C  W : A or T  D : G or T or A

Xaa : Asn or Ser  Asx : Asn or Asp  Xac : Glu or Asp
Xad : Ala or Ser  Xae : Cys or Phe  Xaf : Phe or Leu
Xag : Tyr or Gln or His  Xah : Thr or Ala  Xai : Val or Thr
Xaj : Pro or Leu  Xak : His or Arg  Xal : Arg or Lys
Xam : Ile or Ala  Xan : Val or Leu  Xao : Tyr or Phe
Xap : Thr or Pro  Xaq : Thr or Met or Ala









SEQ ID NO: 65 

SEQUENCE LENGTH: 311 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: N17-1, N17-2, N17-3, H17-1, H17-3 

TGT GAG CCC GAA CCG GAT GTA ACA GTG STC ACY TCC ATG CTC ACC GAC 48 
Cys Glu Pro Glu Pro Asp Val Thr Val Xaa Thr Ser Met Leu Thr Asp 

1 5 10 15 

CCC TCC CAC ATY ACA GCA GAG RCG GCT RRG CGT AGG CTG RCC AGA GGG 96 
Pro Ser His Ile Thr Ala Glu Xab Ala Xac Arg Arg Leu Xab Arg Gly

20 25 30 

TCT CCY CCT YCY TYG RCC AGY TCT TCA GCT AGY CAG TTG TCT GCG CYH 144 
Ser Pro Pro Xad Xae Xab Ser Ser Ser Ala Ser Gln Leu Ser Ala Xaf

35 40 45 

TCY YYG MAG GCR ACA TGY ACT ACC CAT CAD GRC KCC CCR GAC RCT GAC 192 
Ser Xae Xag Ala Thr Cys Thr Thr His Xah Xai Xaj Pro Asp Xab Asp 

50 55 60 

CTC ATC GAG GCC AAC CTC CTR TGG CGG CAG GAG ATG GGM GGR AAC ATC 240 
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
65 70 75 80 

ACC CGY GTG GAG TYA GAG ARC AAG RTA GTR ATT CTR GAC TCT TYY GAM 288 
Thr Arg Val Glu Xae Glu Xak Lys Xal Val Ile Leu Asp Ser Xam Xan

85 90 95 

CCG CTT CGA GCG GAG GAG GAT GA 311
Pro Leu Arg Ala Glu Glu Asp 

100 



Y : C or T  R : A or G  M : A or C  K : G or T
S : G or C  H : A or T or C  D : G or T or A

Xaa : Val or Leu  Xab : Ala or Thr
Xac : Arg or Lys or Gly  Xad : Pro or Ser  Xae : Ser or
Xaf : Pro or Leu  Xag : Gln or Lys  Xah : Gln or
Xai : Gly or Asp  Xaj : Ala or Ser  Xak : Asn or
Xal : Ile or Val  Xam : Phe or Ser  Xan : Glu or



SEQ ID NO: 66 

SEQUENCE LENGTH: 740 base pairs 
SEQUENCE TYPE: nucleic acid



STRANDEDNESS: double 

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: 028-1, 028-2, 028-4 



GTG GTA GTC CTG GAC TCG TTG GAS CCG CTT CRA GCG RAG GAA GRT GAG 48 
Val Val Val Leu Asp Ser Leu Xaa Pro Leu Xab Ala Xac Glu Xad Glu 
1 5 10 15

AGG GAA GTG TCC GTT GCG GCG GAG ATC CTG CGR AAG ACC ARG AAA TTC 96
Arg Glu Val Ser Val Ala Ala Glu Ile Leu Arg Lys Thr Xae Lys Phe

20 25 30 

CCC GCA GCG ATG CCC GTA TGG GCA CGC CCG GAC TAC AAC CCA CCA TTA 144
Pro Ala Ala Met Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu 

35 40 45 

CTA GAG TCT TGG AAG AAC CCG GAC TAC GTC CCT CCR GTG GTA CAC GGG 192 
Leu Glu Ser Trp Lys Asn Pro Asp Tyr Val Pro Pro Val Val His Gly 

50 55 60 

TGC CCA TTG CCG CCT AYC AAG GCC CCT CCA ATA CCA CCT CCA CGR AGA 240 
Cys Pro Leu Pro Pro Xaf Lys Ala Pro Pro Ile Pro Pro Pro Arg Arg
65 70 75 80 

AAG AGR ACG GTT GYC CTG ACA GAA TCC WCC GTG TCC TCT GCC TTG GCG 288 
Lys Arg Thr Val Xag Leu Thr Glu Ser Xah Val Ser Ser Ala Leu Ala 

85 90 95 

GAG CTT GCT ACA AAG ACC TTT GGC AGT TCC GGA TCG TCG GCC GTC GAC 336 
Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Gly Ser Ser Ala Val Asp 

100 105 110 

AGC GGC ACG GCG ACY GGC CCT CCT GAC CAG GCC TCC GCC GAA GGA GAT 384 
Ser Gly Thr Ala Thr Gly Pro Pro Asp Gln Ala Ser Ala Glu Gly Asp

115 120 125 

GCA GGA TCC GAC GCT GAG TCG TAC TCC TCC ATG CCC CCC CTT GAG GGA 432 
Ala Gly Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly 

130 135 140 

GAG CCG GGG GAC CCY GAT CTC AGC GAC GGG TCT TGG TCT ACY GTA AGC 480 
Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser
145 150 155 160

GAG GAG GCC RGC GAG GAC GTC GTC TGC TGC TCG ATG TCC TAC ACA TGG 528
Glu Glu Ala Xai Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp
165 170 175

ACA GGC GCC TTA ATT ACA CCA TGC RCC GCG GAG GAG AGC AAG CTG CCC 576
Thr Gly Ala Leu Ile Thr Pro Cys Xaj Ala Glu Glu Ser Lys Leu Pro
180 185 190

ATT AAT GCG CTG AGC AAC YCT TTG CTG CGY CAC CAC AAC ATG GTC TAT 624
Ile Asn Ala Leu Ser Asn Xak Leu Leu Arg His His Asn Met Val Tyr
195 200 205

GCC ACA ACA TCC CGC AGC GCA AGC CAG CGG CAG AAA AAG GTC ACA TTT 672
Ala Thr Thr Ser Arg Ser Ala Ser Gln Arg Gln Lys Lys Val Thr Phe
210 215 220

GAC AGA CTG CAA GTC CTG GAT GAC CAC TAC CGG GAC GTG CTC AAG GAC 720
Asp Arg Leu Gln Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Asp
225 230 235 240

ATG AAG GCC AAG GCG TCC AC 740
Met Lys Ala Lys Ala Ser
245

Y : C or T  R : A or G  S : G or C  W : A or T

Xaa : Glu or Asp  Xab : Gln or Arg  Xac : Lys or Glu
Xad : Gly or Asp  Xae : Arg or Lys  Xaf : Thr or Ile
Xag : Val or Ala  Xah : Ser or Thr  Xai : Ser or Gly
Xaj : Ala or Thr  Xak : Pro or Ser



SEQ ID NO: 67 

SEQUENCE LENGTH: 515 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N29-1, N29-2, N29-3 






AC TAC CGG GAC GTG CTG AAG GAG ATG AAG GCG AAG GCG TCC ACA GTT 47
Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val
1 5 10 15 

AAG GCT AAA CTT CTA TCT GTA GAG GAA GCC TGY AAG CTG ACG CCC CCA 95 
Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro

20 25 30 

CAC TCG GCC AGA TCT AAR TTT GGC TAC GGG GCA AAG GAC GTC CGG AGC 143 
His Ser Ala Arg Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Ser 
35 40 45

CTG TCC AGC AAG GCC GTT AAC CAC ATC CGC TCC GTG TGG ARG GAC TTG 191 
Leu Ser Ser Lys Ala Val Asn His Ile Arg Ser Val Trp Xaa Asp Leu
50 55 60 

CTG GAA GAC ACT GAR ACA CCA ATT GAC ACC ACC ATC ATG GCA AAA AAT 239
Leu Glu Asp Thr Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn

65 70 75 

GAG GTT TTC TGT GTT CAA CCA GAG AAA GGA GGC CGC AAG CCA GCT CGC 287 
Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg
80 85 90 95 

CTT ATC GTA TTC CCA GAC TTG GGG GTT CGT GTG TGC GAG AAA ATG GCC 335 
Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala

100 105 110 

CTC TAC GAC GTG GTC TCC ACT CTT CCT CAG GCC GTG ATG GGC TCC TCA 383 
Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser Ser

115 120 125 

TAC GGA TTC CAG TAC TCC CCT GGA CAG CGG GTC GAG TTC CTG GTG AAT 431 
Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn

130 135 140 

GCC TGG AAG TCA AAG AAG AGY CCT ATG GGC TTT KCA TAT GAC ACC CGC 479 
Ala Trp Lys Ser Lys Lys Ser Pro Met Gly Phe Xab Tyr Asp Thr Arg 

145 150 155 

TGT TTT GAC TCA ACG GTC ACC GAG AAC GAC ATC CGT 515 
Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg
160 165 170 

Y : C or T  R : A or G  K : G or T

Xaa : Lys or Glu Xab : Ala or Ser 






SEQ ID NO: 68 

SEQUENCE LENGTH: 401 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N18-2, N18-3, N18-4, H18-1, H18-2, H18-3

TG GGG ATC CCG TAT GAT ACC CGC TGC TTT GAC TCA ACR GTC ACY GAG 47 
Gly Ile Pro Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1 5 10 15

ARY GAY ATC CGT RYT GAG GAG TCA ATY TAY CAA TGY TGT GAC TTG GHC 95 
Xaa Asp Ile Arg Xab Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Xac

20 25 30 

CCC GAG GCC AGA CAG GCY ATA AGG TCG CTC ACA GAG CGG CTT TAT ATC 143 
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile

35 40 45 

GGG GGC CCC YTG ACY AAT TCA AAR GGG CAR AAC TGC GGY TAT CGC CGG 191 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg

50 55 60 

TGC CGC GYC AGC GGC GTG CTG ACG ACY AGC TGC GGT AAT ACY CTY ACA 239 
Cys Arg Xad Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 

65 70 75 

TGT TAC TTG AAG GCC TCT GCA GCC TGT CGA GCT GCR AAG CTC CRG GAC 287 
Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala Ala Lys Leu Xae Asp 
80 85 90 95 

TGC ACR ATG CTC GTG TGC GGR GAC GAC CTT GTC GTY ATC TGT GAR AGC 335 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser

100 105 110 

GCG GGR ACC CAG GAG GAC GCG GCR ARC CTA CGA GTC TTC ACG GAG GCT 383 
Ala Gly Thr Gln Glu Asp Ala Ala Xaa Leu Arg Val Phe Thr Glu Ala

115 120 125 

ATG ACC AGG AAT TCC GCC 401 
Met Thr Arg Asn Ser Ala 
130 

Y : C or T  R : A or G  H : A or T or C

Xaa : Asn or Ser  Xab : Thr or Ile or Val
Xac : Asp or Val or Ala  Xad : Ala or Val  Xae : Gln or Arg
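
The ambiguity legends in this listing pair each degenerate codon (written with the symbols Y, R, M, K, S, W, H, D) with the residues it may encode. As a minimal sketch, assuming the standard genetic code applies, the relationship can be checked by expanding a degenerate codon into its concrete codons and translating each one. The names `expand_codon` and `possible_residues` are illustrative and do not come from the listing.

```python
from itertools import product

# Nucleotide ambiguity symbols as defined in the legends of this listing.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "R": "AG", "M": "AC", "K": "GT",
    "S": "GC", "W": "AT", "H": "ACT", "D": "GTA",
}

# Standard genetic code (an assumption), built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def expand_codon(codon):
    """Return every concrete codon matched by a degenerate codon."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in codon))]


def possible_residues(codon):
    """Return the sorted three-letter residues a degenerate codon can encode."""
    return sorted({THREE_LETTER[CODON_TABLE[c]] for c in expand_codon(codon)})


if __name__ == "__main__":
    # Degenerate codons taken from SEQ ID NO: 68.
    for codon in ("ARY", "GHC", "GYC", "CRG", "RYT"):
        print(codon, possible_residues(codon))
```

For SEQ ID NO: 68 this reproduces the entries Xaa (ARY), Xac (GHC), Xad (GYC) and Xae (CRG). The full expansion can be a superset of what the legend lists: RYT expands to four residues, while the legend gives three for Xab.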

SEQ ID NO: 69 
SEQUENCE LENGTH: 1171 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: 030

TG GGG ATC CCG TAT GAT ACC CGC TGC TTT GAC TCA ACR GTC ACT GAG 47 
Gly Ile Pro Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1 5 10 15

AAT GAC ATC CGT GTY GAG GAG TCA ATT TAC CAA TGT TGT GAC TTG GCC 95 
Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Ala

20 25 30 

CCC GAG GCC AGA CAG GCC ATA AGG TCR CTC ACA GAG CGG CTT TAC ATC 143 
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile

35 40 45 

GGG GGC CCC CTG ACT AAT TCA AAR GGG CAG AAC TGC GGY TAT CGC CGG 191 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg

50 55 60 

TGC CGC GYC AGC GGC GTG CTG ACG ACT AGC TGC GGY AAT ACC CTC ACA 239 
Cys Arg Xaa Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 

65 70 75 

TGT TAC TTG AAG GCC TCT GCA GCC TGT CGA GCT GCA AAG CTC CAG GAC 287
Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp
80 85 90 95 

TGC ACG ATG CTT GTG TGC GGA GAC GAC CTT GTC GTT ATC TGT GAW AGC 335 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Xab Ser

100 105 110 

GCG GGA ACT CAG GAG GAC GCG GCG AGC CTA CGA GTC TTC ACG GAG GCT 383 
Ala Gly Thr Gln Glu Asp Ala Ala Ser Leu Arg Val Phe Thr Glu Ala

115 120 125

ATG ACT AGG TAC TCT GCC CCC CCC GGG GAC CCG CCC CAA CCA GAA TAC 431
Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr

130 135 140 

GAC TTG GAG CTG ATA ACA TCA TGY TCC TCC AAY GTG TCG GTC GCG CAC 479 
Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His

145 150 155 

GAC GCA TCA GGC AAA CGG GTG TAC TAY CTC ACC CGT GAC CCC MCC ACC 527 
Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Xac Thr 
160 165 170 175 

CCC CTW GCG CGG GCT GCG TGG GAG ACA GCT AGA CAC ACT CCA GTC AAC 575 
Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn 
180 185 190

TCC TGG CTA GGC AAC ATC ATC ATG TAY GCG CCC ACC TTA TGG GCA AGG 623 
Ser Trp Leu Gly Asn Ile Ile Met Tyr Ala Pro Thr Leu Trp Ala Arg

195 200 205 

ATG ATT CTG ATG ACC CAC TTC TTC TCC ATC CTT CTA GCC CAG GAG CAA 671 
Met Ile Leu Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Glu Gln

210 215 220 

CTT GAA AAA GCC CTA GAT TGT CAG ATC TAY GGG GCC ACT TAC TCC ATT 719 
Leu Glu Lys Ala Leu Asp Cys Gln Ile Tyr Gly Ala Thr Tyr Ser Ile

225 230 235 

GAG CCA CTT GAC CTA CCT CAG ATC ATT CAA CGA CTC CAY GGT CTT AGC 767 
Glu Pro Leu Asp Leu Pro Gln Ile Ile Gln Arg Leu His Gly Leu Ser
240 245 250 255 

GCA TTT TCA CTC CAT AGT TAC TCT CCA GGT GAG ATC AAT AGG GTG GCT 815 
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala

260 265 270 

TCA TGC CTC AGG AAA CTT GGG GTA CCG CCC TTG CGA GTC TGG AGA CAT 863 
Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His 

275 280 285 

CGG GCC AGA AGC GTC CGC GCT AAG CTA CTG TCC CAG GGG GGG AGG GCC 911 
Arg Ala Arg Ser Val Arg Ala Lys Leu Leu Ser Gln Gly Gly Arg Ala

290 295 300 

GCC ACC TGT GGC AAA TAC CTC TTC AAC TGG GCA GTA AAG ACC AAG CTC 959 
Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Lys Thr Lys Leu 

305 310 315 

AAA CYC ACT CCA ATC CCR GAA GCG TCC CAG CTG GAC TTG TCC GGC TGG 1007
Lys Xad Thr Pro Ile Pro Glu Ala Ser Gln Leu Asp Leu Ser Gly Trp
320 325 330 335 

TTC GTT GCT GGT TAC AGC GGG GGA GAC ATA TAT CAC AGC CTG TCT CGT 1055 
Phe Val Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Leu Ser Arg

340 345 350 

GCC CGA CCC CGC TGG TTY ATG TGG TGC CTA CTC CTA CTT TCC GTA GGG 1103 
Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val Gly 
355 360 365

GTA GGC ATC TAC CTG CTC CCC AAC CGA TGA GCG GGG AGC TAA ACA CTC 1151 
Val Gly Ile Tyr Leu Leu Pro Asn Arg Stop Ala Gly Ser Stop Thr Leu
370 375 380 

CAG GCC AAT AGG CCA TCC CC 1171
Gln Ala Asn Arg Pro Ser
385 

Y : C or T  R : A or G  M : A or C  W : A or T

Xaa : Val or Ala  Xab : Asp or Glu  Xac : Thr or Pro
Xad : Leu or Pro



SEQ ID NO: 70 

SEQUENCE LENGTH: 1084 base pairs 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double 

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: 2217 

GG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 47 
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
1 5 10 15

ATA GCG TTT GCT TCG CGG GGC AAC CAT GTC TCC CCC ACG CAC TAT GTG 95 
Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val

20 25 30 

CCT GAA AGC GAC GCC GCA GCG CGC GTC ACC CAG ATC CTC TCC AAC CTT 143 
Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Asn Leu
35 40 45

ACC ATC ACT CAG CTG TTG AAG AGG CTT CAC CAG TGG ATT AAT GAG GAC 191 
Thr Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp

50 55 60 

TGC TCC ACG CCA TGC TCC GGC TCG TGG CTC AGG GAT GTT TGG GAC TGG 239 
Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp 

65 70 75 

ATA TGC ACG GTA TTG GCT GAT TTC AAG ACC TGG CTC CAG TCC AAG CTC 287 
Ile Cys Thr Val Leu Ala Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu
80 85 90 95 

CTG CCG CGG TTA CCG GGG GTC CCT TTT TTC TCA TGC CAG CGT GGG TAC 335 
Leu Pro Arg Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr

100 105 110

AAG GGG GTT TGG CGG GGA GAT GGC ATC ATG TAT ACC ACC TGC CCA TGT 383 
Lys Gly Val Trp Arg Gly Asp Gly Ile Met Tyr Thr Thr Cys Pro Cys
115 120 125

GGA GCA CAA ATC ACC GGA CAT GTC AAA AAC GGT TCT ATG AGG ATC GTT 431 
Gly Ala Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val
130 135 140 

GGG CCT AGA ACC TGT AGC AAC ACG TGG CAC GGA ACA TTT CCC ATC AAC 479
Gly Pro Arg Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn

145 150 155 

GCG TAC ACC ACA GGC CCC TGC ACA CCC TCC CCG GCG CCA AAC TAT TCC 527 
Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser 
160 165 170 175 

AGG GCG TTG TGG CGG GTG GCC ATT GAG GAG TAT GTG GAG GTC ACG CGG 575 
Arg Ala Leu Trp Arg Val Ala Ile Glu Glu Tyr Val Glu Val Thr Arg

180 185 190 

GTG GGG GAT TTC CAC TAC GTG ACG GGC ATG ACC ACT GAC AAC GTG AAA 623 
Val Gly Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys 

195 200 205

TGC CCA TGC CAG GTT CCG GCC CCC GAA TTC TTC ACA GAA TTG GAT GGG 671 
Cys Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Leu Asp Gly

210 215 220 

GTG CGG CTG CAC AGG TAC GCT CCG GCG TGC AAA CCT CTC CTG CGG GAT 719 
Val Arg Leu His Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg Asp 
225 230 235

GAG GTC ACA TTC CAG GTC GGG CTC AAC CAA TAT ACG GTT GGG TCA CAG 767
Glu Val Thr Phe Gln Val Gly Leu Asn Gln Tyr Thr Val Gly Ser Gln
240 245 250 255 

CTC CCA TGT GAG CCC GAA CCG GAT GTA ACA GTG GTC ACC TCC ATG CTC 815 
Leu Pro Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser Met Leu 

260 265 270 

ACC GAC CCC TCC CAC ATT ACA GCA GAG GCG GCT AGG CGT AGG CTG ACC 863 
Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Arg Arg Arg Leu Thr

275 280 285 

AGA GGG TCT CCC CCT TCC TCG ACC AGT TCT TCA GCT AGT CAG TTG TCT 911 
Arg Gly Ser Pro Pro Ser Ser Thr Ser Ser Ser Ala Ser Gln Leu Ser

290 295 300 

GCG CTT TCT TCG CAG GCA ACA TGC ACT ACC CAT CAG GGC GCC CCA GAC 959 
Ala Leu Ser Ser Gln Ala Thr Cys Thr Thr His Gln Gly Ala Pro Asp

305 310 315 

ACT GAC CTC ATC GAG GCC AAC CTC CTG TGG CGG CAG GAG ATG GGC GGA 1007 
Thr Asp Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly
320 325 330 335 

AAC ATC ACC CGC GTG GAG TCA GAG AAC AAG ATA GTA ATT CTA GAC TCT 1055 
Asn Ile Thr Arg Val Glu Ser Glu Asn Lys Ile Val Ile Leu Asp Ser

340 345 350 

TTT GAA CCG CTT CGA GCG GAG GAG GAT GA 1084 
Phe Glu Pro Leu Arg Ala Glu Glu Asp 

355 360 
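
Several entries in this listing, SEQ ID NO: 70 among them, open with a partial codon (here "GG"), so the translated reading frame begins at the third nucleotide. The following is a minimal sketch of translating such a row, assuming the standard genetic code. The function `translate` and its `offset` parameter are hypothetical names used only for illustration.

```python
# Standard genetic code (an assumption), built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}

# First row of SEQ ID NO: 70 as printed in the listing (47 nucleotides).
ROW = "GG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG"


def translate(seq, offset=0):
    """Translate complete codons of seq, skipping `offset` leading bases.

    Codons containing ambiguity symbols are reported as "Xaa".
    """
    bases = "".join(seq.split()).upper()
    residues = []
    for i in range(offset, len(bases) - 2, 3):
        amino = CODON_TABLE.get(bases[i:i + 3])
        residues.append(THREE_LETTER[amino] if amino else "Xaa")
    return residues


if __name__ == "__main__":
    # The two leading bases belong to an incomplete codon, so offset=2.
    print(" ".join(translate(ROW, offset=2)))
```

With `offset=2` the row yields the fifteen residues printed beneath it in the listing, and the running count at the end of the row (47) equals the two leading bases plus fifteen complete codons.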

SEQ ID NO: 71

SEQUENCE LENGTH: 1004 base pairs 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double 

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: 1728 

TGT GAG CCC GAA CCG GAT GTA ACA GTG GTC ACC TCC ATG CTC ACC GAC 48
Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser Met Leu Thr Asp

1 5 10 15 

CCC TCC CAC ATT ACA GCA GAG GCG GCT AGG CGT AGG CTG ACC AGA GGG 96 
Pro Ser His Ile Thr Ala Glu Ala Ala Arg Arg Arg Leu Thr Arg Gly

20 25 30 

TCT CCC CCT TCC TCG ACC AGT TCT TCA GCT AGT CAG TTG TCT GCG CTT 144 
Ser Pro Pro Ser Ser Thr Ser Ser Ser Ala Ser Gln Leu Ser Ala Leu
35 40 45

TCT TCG CAG GCA ACA TGC ACT ACC CAT CAG GGC GCC CCA GAC ACT GAC 192 
Ser Ser Gln Ala Thr Cys Thr Thr His Gln Gly Ala Pro Asp Thr Asp
50 55 60 

CTC ATC GAG GCC AAC CTC CTG TGG CGG CAG GAG ATG GGC GGA AAC ATC 240
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
65 70 75 80 

ACC CGC GTG GAG TCA GAG AAC AAG ATA GTA ATT CTA GAC TCT TTT GAA 288 
Thr Arg Val Glu Ser Glu Asn Lys Ile Val Ile Leu Asp Ser Phe Glu

85 90 95 

CCG CTT CGA GCG GAG GAG GAT GAG AGG GAA GTG TCC GTT GCG GCG GAG 336 
Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Ala Ala Glu 

100 105 110 

ATC CTG CGG AAG ACC AGG AAA TTC CCC GCA GCG ATG CCC GTA TGG GCA 384 
Ile Leu Arg Lys Thr Arg Lys Phe Pro Ala Ala Met Pro Val Trp Ala

115 120 125 

CGC CCG GAC TAC AAC CCA CCA TTA CTA GAG TCT TGG AAG AAC CCG GAC 432
Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asn Pro Asp 

130 135 140 

TAC GTC CCT CCA GTG GTA CAC GGG TGC CCA TTG CCG CCT ACC AAG GCC 480 
Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Thr Lys Ala
145 150 155 160

CCT CCA ATA CCA CCT CCA CGA AGA AAG AGA ACG GTT GTC CTG ACA GAA 528 
Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu

165 170 175 

TCC TCC GTG TCC TCT GCC TTG GCG GAG CTT GCT ACA AAG ACC TTT GGC 576 
Ser Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly 

180 185 190 

AGT TCC GGA TCG TCG GCC GTC GAC AGC GGC ACG GCG ACC GGC CCT CCT 624 
Ser Ser Gly Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Gly Pro Pro
195 200 205

GAC CAG GCC TCC GCC GAA GGA GAT GCA GGA TCC GAC GCT GAG TCG TAC 672 
Asp Gln Ala Ser Ala Glu Gly Asp Ala Gly Ser Asp Ala Glu Ser Tyr

210 215 220 

TCC TCC ATG CCC CCC CTT GAG GGA GAG CCG GGG GAC CCC GAT CTC AGC 720 
Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser 
225 230 235 240 

GAC GGG TCT TGG TCT ACC GTA AGC GAG GAG GCC AGC GAG GAC GTC GTC 768 
Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val 

245 250 255 

TGC TGC TCG ATG TCC TAC ACA TGG ACA GGC GCC TTA ATT ACA CCA TGC 816 
Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys

260 265 270 

GCC GCG GAG GAG AGC AAG CTG CCC ATT AAT GCG CTG AGC AAC CCT TTG 864 
Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Pro Leu
275 280 285

CTG CGC CAC CAC AAC ATG GTC TAT GCC ACA ACA TCC CGC AGC GCA AGC 912 
Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Ser 
290 295 300 

CAG CGG CAG AAA AAG GTC ACA TTT GAC AGA CTG CAA GTC CTG GAT GAC 960
Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp
305 310 315 320 

CAC TAC CGG GAC GTG CTC AAG GAC ATG AAG GCC AAG GCG TCC AC 1004 
His Tyr Arg Asp Val Leu Lys Asp Met Lys Ala Lys Ala Ser 

325 330

SEQ ID NO: 72

SEQUENCE LENGTH: 857 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: 2918

AC TAC CGG GAC GTG CTG AAG GAC ATG AAG GCC AAG GCG TCC ACA GTT 47
Tyr Arg Asp Val Leu Lys Asp Met Lys Ala Lys Ala Ser Thr Val 
1 5 10 15 

AAG GCT AAA CTT CTA TCT GTA GAG GAA GCC TGC AAG CTG ACG CCC CCA 95 
Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro 

20 25 30 

CAC TCG GCC AGA TCT AAA TTT GAC TAC GGG GCA AAG GAC GTC CAG AGC 143 
His Ser Ala Arg Ser Lys Phe Asp Tyr Gly Ala Lys Asp Val Gln Ser

35 40 45 

CTG TCC AGC AAG GCC GTT AAC CAC ATC CAC TCC GTG TGG AAG GAC TTG 191 
Leu Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu

50 55 60 

CCG GAA GAC ACT GAG ACA CCA ATC GAC ACC ACC ATC ATG GCA AAA AAT 239 
Pro Glu Asp Thr Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn

65 70 75 

GAG GTT TTT TGT GTT CAA CCA GAG AAA GGA GGC CGC AAG CCA GCT CGC 287 
Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg
80 85 90 95 

CTT ATC GTA TTC CCA GAC TTG GGG GTT CGT GTG TGC GAG AAA ATG GCC 335 
Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala

100 105 110

CTC TAC GAC GTG GTC TCC ACT CTT CCT CAG GCC GTG ATG GGC TCC TCA 383 
Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser Ser

115 120 125 

TAC AGA TTT CAG TGC TCC CCT GGA CAG CGG GTC GAG TTC CTG GTG AAT 431 
Tyr Arg Phe Gln Cys Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn

130 135 140 

GCC TGG AAG TCA AAG AAG AGC CCT ATG GGC TTT GCA TAT GAC ACC CGC 479 
Ala Trp Lys Ser Lys Lys Ser Pro Met Gly Phe Ala Tyr Asp Thr Arg 

145 150 155 

TGT TTT GAC TCA ACG GTC ACC GAG AAC GAC ATC CGT ACT GAG GAG TCA 527 
Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Thr Glu Glu Ser
160 165 170 175

ATT TAT CAA TGT TGT GAC TTG GAC CCC GAG GCC AGA CAG GCC ATA AGG 575 
Ile Tyr Gln Cys Cys Asp Leu Asp Pro Glu Ala Arg Gln Ala Ile Arg

180 185 190 

TCG CTC ACA GAG CGG CTT TAT ATC GGG GGC CCC TTG ACC AAT TCA AAA 623
Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys

195 200 205 

GGG CAA AAC TGC GGC TAT CGC CGG TGC CGC GCC AGC GGC GTG CTG ACG 671 
Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr

210 215 220 

ACT AGC TGC GGT AAT ACC CTC ACA TGT TAC TTG AAG GCC TCT GCA GCC 719 
Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala 
225 230 235

TGT CGA GCT GCG AAG CTC CAG GAC TGC ACG ATG CTC GTG TGC GGA GAC 767 
Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp
240 245 250 255 

GAC CTT GTC GTT ATC TGT GAA AGC GCG GGA ACC CAG GAG GAC GCG GCA 815 
Asp Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala

260 265 270 

AAC CTA CGA GTC TTC ACG GAG GCT ATG ACC AGG AAT TCC GCC 857 
Asn Leu Arg Val Phe Thr Glu Ala Met Thr Arg Asn Ser Ala 

275 280 285 

SEQ ID NO: 73 

SEQUENCE LENGTH: 1818 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: 1718 


TGT GAG CCC GAA CCG GAT GTA ACA GTG GTC ACC TCC ATG CTC ACC GAC 48 
Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser Met Leu Thr Asp 
1 5 10 15 

CCC TCC CAC ATT ACA GCA GAG GCG GCT AGG CGT AGG CTG ACC AGA GGG 96
Pro Ser His Ile Thr Ala Glu Ala Ala Arg Arg Arg Leu Thr Arg Gly

20 25 30 

TCT CCC CCT TCC TCG ACC AGT TCT TCA GCT AGT CAG TTG TCT GCG CTT 144 
Ser Pro Pro Ser Ser Thr Ser Ser Ser Ala Ser Gln Leu Ser Ala Leu
35 40 45

TCT TCG CAG GCA ACA TGC ACT ACC CAT CAG GGC GCC CCA GAC ACT GAC 192 
Ser Ser Gln Ala Thr Cys Thr Thr His Gln Gly Ala Pro Asp Thr Asp

50 55 60 

CTC ATC GAG GCC AAC CTC CTG TGG CGG CAG GAG ATG GGC GGA AAC ATC 240 
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
65 70 75 80 

ACC CGC GTG GAG TCA GAG AAC AAG ATA GTA ATT CTA GAC TCT TTT GAA 288 
Thr Arg Val Glu Ser Glu Asn Lys Ile Val Ile Leu Asp Ser Phe Glu

85 90 95 

CCG CTT CGA GCG GAG GAG GAT GAG AGG GAA GTG TCC GTT GCG GCG GAG 336 
Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Ala Ala Glu 

100 105 110

ATC CTG CGG AAG ACC AGG AAA TTC CCC GCA GCG ATG CCC GTA TGG GCA 384 
Ile Leu Arg Lys Thr Arg Lys Phe Pro Ala Ala Met Pro Val Trp Ala

115 120 125 

CGC CCG GAC TAC AAC CCA CCA TTA CTA GAG TCT TGG AAG AAC CCG GAC 432
Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asn Pro Asp 

130 135 140 

TAC GTC CCT CCA GTG GTA CAC GGG TGC CCA TTG CCG CCT ACC AAG GCC 480 
Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Thr Lys Ala 
145 150 155 160

CCT CCA ATA CCA CCT CCA CGA AGA AAG AGA ACG GTT GTC CTG ACA GAA 528 
Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu

165 170 175 

TCC TCC GTG TCC TCT GCC TTG GCG GAG CTT GCT ACA AAG ACC TTT GGC 576 
Ser Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly 

180 185 190 

AGT TCC GGA TCG TCG GCC GTC GAC AGC GGC ACG GCG ACC GGC CCT CCT 624 
Ser Ser Gly Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Gly Pro Pro 

195 200 205 

GAC CAG GCC TCC GCC GAA GGA GAT GCA GGA TCC GAC GCT GAG TCG TAC 672 
Asp Gln Ala Ser Ala Glu Gly Asp Ala Gly Ser Asp Ala Glu Ser Tyr

210 215 220 

TCC TCC ATG CCC CCC CTT GAG GGA GAG CCG GGG GAC CCC GAT CTC AGC 720 
Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser 
225 230 235 240 
GAC GGG TCT TGG TCT ACC GTA AGC GAG GAG GCC AGC GAG GAC GTC GTC 768
Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val 

245 250 255 

TGC TGC TCG ATG TCC TAC ACA TGG ACA GGC GCC TTA ATT ACA CCA TGC 816
Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys

260 265 270 

GCC GCG GAG GAG AGC AAG CTG CCC ATT AAT GCG CTG AGC AAC CCT TTG 864 
Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Pro Leu

275 280 285 

CTG CGC CAC CAC AAC ATG GTC TAT GCC ACA ACA TCC CGC AGC GCA AGC 912 
Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Ser 

290 295 300 

CAG CGG CAG AAA AAG GTC ACA TTT GAC AGA CTG CAA GTC CTG GAT GAC 960 
Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp
305 310 315 320 

CAC TAC CGG GAC GTG CTS AAG GAC ATG AAG GCC AAG GCG TCC ACA GTT 1008 
His Tyr Arg Asp Val Xaa Lys Asp Met Lys Ala Lys Ala Ser Thr Val 

325 330 335 

AAG GCT AAA CTT CTA TCT GTA GAG GAA GCC TGC AAG CTG ACG CCC CCA 1056 
Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro 

340 345 350 

CAC TCG GCC AGA TCT AAA TTT GAC TAC GGG GCA AAG GAC GTC CAG AGC 1104 
His Ser Ala Arg Ser Lys Phe Asp Tyr Gly Ala Lys Asp Val Gln Ser

355 360 365 

CTG TCC AGC AAG GCC GTT AAC CAC ATC CAC TCC GTG TGG AAG GAC TTG 1152 
Leu Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu

370 375 380 

CCG GAA GAC ACT GAG ACA CCA ATC GAC ACC ACC ATC ATG GCA AAA AAT 1200 
Pro Glu Asp Thr Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn
385 390 395 400 

GAG GTT TTT TGT GTT CAA CCA GAG AAA GGA GGC CGC AAG CCA GCT CGC 1248 
Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg

405 410 415 

CTT ATC GTA TTC CCA GAC TTG GGG GTT CGT GTG TGC GAG AAA ATG GCC 1296 
Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala

420 425 430 

CTC TAC GAC GTG GTC TCC ACT CTT CCT CAG GCC GTG ATG GGC TCC TCA 1344
Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser Ser

435 440 445 

TAC AGA TTT CAG TGC TCC CCT GGA CAG CGG GTC GAG TTC CTG GTG AAT 1392 
Tyr Arg Phe Gln Cys Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn

450 455 460 

GCC TGG AAG TCA AAG AAG AGC CCT ATG GGC TTT GCA TAT GAC ACC CGC 1440 
Ala Trp Lys Ser Lys Lys Ser Pro Met Gly Phe Ala Tyr Asp Thr Arg 
465 470 475 480

TGT TTT GAC TCA ACG GTC ACC GAG AAC GAC ATC CGT ACT GAG GAG TCA 1488 
Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Thr Glu Glu Ser

485 490 495 

ATT TAT CAA TGT TGT GAC TTG GAC CCC GAG GCC AGA CAG GCC ATA AGG 1536
Ile Tyr Gln Cys Cys Asp Leu Asp Pro Glu Ala Arg Gln Ala Ile Arg

500 505 510 

TCG CTC ACA GAG CGG CTT TAT ATC GGG GGC CCC TTG ACC AAT TCA AAA 1584 
Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys

515 520 525 

GGG CAA AAC TGC GGC TAT CGC CGG TGC CGC GCC AGC GGC GTG CTG ACG 1632 
Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr

530 535 540 

ACT AGC TGC GGT AAT ACC CTC ACA TGT TAC TTG AAG GCC TCT GCA GCC 1680 
Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala 
545 550 555 560

TGT CGA GCT GCG AAG CTC CAG GAC TGC ACG ATG CTC GTG TGC GGA GAC 1728 
Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp

565 570 575 

GAC CTT GTC GTT ATC TGT GAA AGC GCG GGA ACC CAG GAG GAC GCG GCA 1776 
Asp Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala

580 585 590 

AAC CTA CGA GTC TTC ACG GAG GCT ATG ACC AGG AAT TCC GCC 1818 
Asn Leu Arg Val Phe Thr Glu Ala Met Thr Arg Asn Ser Ala 
595 600 605 
SEQ ID NO: 74

SEQUENCE LENGTH: 2591 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double






TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: 2218 



GG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 47 
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
1 5 10 15

ATA GCG TTT GCT TCG CGG GGC AAC CAT GTC TCC CCC ACG CAC TAT GTG 95 
Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val

20 25 30 

CCT GAA AGC GAC GCC GCA GCG CGC GTC ACC CAG ATC CTC TCC AAC CTT 143 
Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Asn Leu

35 40 45 

ACC ATC ACT CAG CTG TTG AAG AGG CTT CAC CAG TGG ATT AAT GAG GAC 191 
Thr Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp

50 55 60 

TGC TCC ACG CCA TGC TCC GGC TCG TGG CTC AGG GAT GTT TGG GAC TGG 239 
Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp 

65 70 75 

ATA TGC ACG GTA TTG GCT GAT TTC AAG ACC TGG CTC CAG TCC AAG CTC 287 
Ile Cys Thr Val Leu Ala Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu
80 85 90 95 

CTG CCG CGG TTA CCG GGG GTC CCT TTT TTC TCA TGC CAG CGT GGG TAC 335 
Leu Pro Arg Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr
100 105 110

AAG GGG GTT TGG CGG GGA GAT GGC ATC ATG TAT ACC ACC TGC CCA TGT 383 
Lys Gly Val Trp Arg Gly Asp Gly Ile Met Tyr Thr Thr Cys Pro Cys

115 120 125 

GGA GCA CAA ATC ACC GGA CAT GTC AAA AAC GGT TCT ATG AGG ATC GTT 431
Gly Ala Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val

130 135 140 

GGG CCT AGA ACC TGT AGC AAC ACG TGG CAC GGA ACA TTT CCC ATC AAC 479 
Gly Pro Arg Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn
145 150 155 
GCG TAC ACC ACA GGC CCC TGC ACA CCC TCC CCG GCG CCA AAC TAT TCC 527
Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser 
160 165 170 175 

AGG GCG TTG TGG CGG GTG GCC ATT GAG GAG TAT GTG GAG GTC ACG CGG 575 
Arg Ala Leu Trp Arg Val Ala Ile Glu Glu Tyr Val Glu Val Thr Arg

180 185 190 

GTG GGG GAT TTC CAC TAC GTG ACG GGC ATG ACC ACT GAC AAC GTG AAA 623
Val Gly Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys 

195 200 205 

TGC CCA TGC CAG GTT CCG GCC CCC GAA TTC TTC ACA GAA TTG GAT GGG 671 
Cys Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Leu Asp Gly

210 215 220 

GTG CGG CTG CAC AGG TAC GCT CCG GCG TGC AAA CCT CTC CTG CGG GAT 719 
Val Arg Leu His Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg Asp 

225 230 235 

GAG GTC ACA TTC CAG GTC GGG CTC AAC CAA TAT ACG GTT GGG TCA CAG 767
Glu Val Thr Phe Gln Val Gly Leu Asn Gln Tyr Thr Val Gly Ser Gln
240 245 250 255 

CTC CCA TGT GAG CCC GAA CCG GAT GTA ACA GTG GTC ACC TCC ATG CTC 815 
Leu Pro Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser Met Leu 

260 265 270 

ACC GAC CCC TCC CAC ATT ACA GCA GAG GCG GCT AGG CGT AGG CTG ACC 863 
Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Arg Arg Arg Leu Thr

275 280 285 

AGA GGG TCT CCC CCT TCC TCG ACC AGT TCT TCA GCT AGT CAG TTG TCT 911 
Arg Gly Ser Pro Pro Ser Ser Thr Ser Ser Ser Ala Ser Gln Leu Ser

290 295 300 

GCG CTT TCT TCG CAG GCA ACA TGC ACT ACC CAT CAG GGC GCC CCA GAC 959 
Ala Leu Ser Ser Gln Ala Thr Cys Thr Thr His Gln Gly Ala Pro Asp

305 310 315 

ACT GAC CTC ATC GAG GCC AAC CTC CTG TGG CGG CAG GAG ATG GGC GGA 1007 
Thr Asp Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly
320 325 330 335 

AAC ATC ACC CGC GTG GAG TCA GAG AAC AAG ATA GTA ATT CTA GAC TCT 1055 
Asn Ile Thr Arg Val Glu Ser Glu Asn Lys Ile Val Ile Leu Asp Ser

340 345 350 

TTT GAA CCG CTT CGA GCG GAG GAG GAT GAG AGG GAA GTG TCC GTT GCG 1103
Phe Glu Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Ala

355 360 365 

GCG GAG ATC CTG CGG AAG ACC AGG AAA TTC CCC GCA GCG ATG CCC GTA 1151 
Ala Glu Ile Leu Arg Lys Thr Arg Lys Phe Pro Ala Ala Met Pro Val

370 375 380 

TGG GCA CGC CCG GAC TAC AAC CCA CCA TTA CTA GAG TCT TGG AAG AAC 1199
Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asn 

385 390 395 

CCG GAC TAC GTC CCT CCA GTG GTA CAC GGG TGC CCA TTG CCG CCT ACC 1247 
Pro Asp Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Thr 
400 405 410 415 

AAG GCC CCT CCA ATA CCA CCT CCA CGA AGA AAG AGA ACG GTT GTC CTG 1295 
Lys Ala Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu

420 425 430 

ACA GAA TCC TCC GTG TCC TCT GCC TTG GCG GAG CTT GCT ACA AAG ACC 1343 
Thr Glu Ser Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr 

435 440 445 

TTT GGC AGT TCC GGA TCG TCG GCC GTC GAC AGC GGC ACG GCG ACC GGC 1391 
Phe Gly Ser Ser Gly Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Gly 

450 455 460 

CCT CCT GAC CAG GCC TCC GCC GAA GGA GAT GCA GGA TCC GAC GCT GAG 1439 
Pro Pro Asp Gln Ala Ser Ala Glu Gly Asp Ala Gly Ser Asp Ala Glu

465 470 475 

TCG TAC TCC TCC ATG CCC CCC CTT GAG GGA GAG CCG GGG GAC CCC GAT 1487 
Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp 
480 485 490 495 

CTC AGC GAC GGG TCT TGG TCT ACC GTA AGC GAG GAG GCC AGC GAG GAC 1535 
Leu Ser Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp 

500 505 510 

GTC GTC TGC TGC TCG ATG TCC TAC ACA TGG ACA GGC GCC TTA ATT ACA 1583 
Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr

515 520 525 

CCA TGC GCC GCG GAG GAG AGC AAG CTG CCC ATT AAT GCG CTG AGC AAC 1631 
Pro Cys Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn

530 535 540 

CCT TTG CTG CGC CAC CAC AAC ATG GTC TAT GCC ACA ACA TCC CGC AGC 1679 
Pro Leu Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser
545 550 555

GCA AGC CAG CGG CAG AAA AAG GTC ACA TTT GAC AGA CTG CAA GTC CTG 1727 
Ala Ser Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu
560 565 570 575 

GAT GAC CAC TAC CGG GAC GTG CTS AAG GAC ATG AAG GCC AAG GCG TCC 1775 
Asp Asp His Tyr Arg Asp Val Xaa Lys Asp Met Lys Ala Lys Ala Ser 

580 585 590 

ACA GTT AAG GCT AAA CTT CTA TCT GTA GAG GAA GCC TGC AAG CTG ACG 1823 
Thr Val Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr 

595 600 605 

CCC CCA CAC TCG GCC AGA TCT AAA TTT GAC TAC GGG GCA AAG GAC GTC 1871 
Pro Pro His Ser Ala Arg Ser Lys Phe Asp Tyr Gly Ala Lys Asp Val 

610 615 620 

CAG AGC CTG TCC AGC AAG GCC GTT AAC CAC ATC CAC TCC GTG TGG AAG 1919 
Gln Ser Leu Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys

625 630 635 

GAC TTG CCG GAA GAC ACT GAG ACA CCA ATC GAC ACC ACC ATC ATG GCA 1967 
Asp Leu Pro Glu Asp Thr Glu Thr Pro Ile Asp Thr Thr Ile Met Ala
640 645 650 655 

AAA AAT GAG GTT TTT TGT GTT CAA CCA GAG AAA GGA GGC CGC AAG CCA 2015 
Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro

660 665 670 

GCT CGC CTT ATC GTA TTC CCA GAC TTG GGG GTT CGT GTG TGC GAG AAA 2063 
Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys

675 680 685 

ATG GCC CTC TAC GAC GTG GTC TCC ACT CTT CCT CAG GCC GTG ATG GGC 2111 
Met Ala Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Ala Val Met Gly

690 695 700 

TCC TCA TAC AGA TTT CAG TGC TCC CCT GGA CAG CGG GTC GAG TTC CTG 2159 
Ser Ser Tyr Arg Phe Gln Cys Ser Pro Gly Gln Arg Val Glu Phe Leu
705 710 715 

GTG AAT GCC TGG AAG TCA AAG AAG AGC CCT ATG GGC TTT GCA TAT GAC 2207
Val Asn Ala Trp Lys Ser Lys Lys Ser Pro Met Gly Phe Ala Tyr Asp 
720 725 730 735 

ACC CGC TGT TTT GAC TCA ACG GTC ACC GAG AAC GAC ATC CGT ACT GAG 2255 
Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Thr Glu

740 745 750 
GAG TCA ATT TAT CAA TGT TGT GAC TTG GAC CCC GAG GCC AGA CAG GCC 2303
Glu Ser Ile Tyr Gln Cys Cys Asp Leu Asp Pro Glu Ala Arg Gln Ala

755 760 765 

ATA AGG TCG CTC ACA GAG CGG CTT TAT ATC GGG GGC CCC TTG ACC AAT 2351 
Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn

770 775 780 

TCA AAA GGG CAA AAC TGC GGC TAT CGC CGG TGC CGC GCC AGC GGC GTG 2399 
Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val

785 790 795 

CTG ACG ACT AGC TGC GGT AAT ACC CTC ACA TGT TAC TTG AAG GCC TCT 2447 
Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser 
800 805 810 815 

GCA GCC TGT CGA GCT GCG AAG CTC CAG GAC TGC ACG ATG CTC GTG TGC 2495 
Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys

820 825 830 

GGA GAC GAC CTT GTC GTT ATC TGT GAA AGC GCG GGA ACC CAG GAG GAC 2543 
Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp

835 840 845 

GCG GCA AAC CTA CGA GTC TTC ACG GAG GCT ATG ACC AGG AAT TCC GCC 2591 
Ala Ala Asn Leu Arg Val Phe Thr Glu Ala Met Thr Arg Asn Ser Ala 
850 855 860 

SEQ ID NO: 75 

SEQUENCE LENGTH: 4296 base pairs 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: 1530U 



GCGGATCCT CCA CCT CCA TCG TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG 51 
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg

1 5 10
CTG AAG CCT ACG CTA CAC GGG CCA ACG CCC CTG TTG TAT AGG TTA GGA 99 
Leu Lys

15 
GCC GTT 
Ala Val 

ATG GCA 
Met Ala 

CTG GTA 
Leu Val 

GGC AGC 
Gly Ser 
80 

GTT ATT 
Val He 

95 
GAG TGC 
Glu Cys 

GAG CAA 
Glu Gin 

CAA GCG 
Gin Ala 

GAG ACC 
Glu Thr 
160 
TAC TTA 
Tyr Leu 
175 

CTG ATG 
Leu Met 

ACC CTC 
Thr Leu 



Pro Thr Leu 

CAG AAC GAG 
Gin Asn Glu 
35 

TGC ATG TCG 
Cys Met Ser 
50 

GGC GGG GTC 
Gly Gly Val 
65 

GTG GTC ATT 
Val Val lie 

CCC GAC AGG 
Pro Asp Arg 



His Gly 

20 
GTT ACC 
Val Thr 

GCT GAC 
Ala Asp 

CTC GCG 
Leu Ala 



GCC TCG CAC 
Ala Ser His 
115 

TTC AAG CAG 
Phe Lys Gin 

130 
GAG GCT GCT 
Glu Ala Ala 
145 

TTC TGG GCG 
Phe Trp Ala 



GTG 
Val 

GAA 
Glu 
100 
CTC 
Leu 



GGC 
Gly 
85 
GTT 
Val 

CCT 
Pro 



Pro Thr Pro 

CTC ACA CAC 
Leu Thr His 
40 

CTA GAG GTC 
Leu Glu Val 
55 

GCT CTG GCC 
Ala Leu Ala 
70 

AGG ATC ATC 
Arg He He 



Leu Leu Tyr Arg Leu 
25 

CCC ATA ACC 
Pro He Thr 



AAG GCG 
Lys Ala 

GCT CCC 
Ala Pro 



GCA GGC TTG 
Ala Gly Leu 

GCA TTC ACA 
Ala Phe Thr 
195 

CTG TTT AAC 

Leu Phe Asn 



AAG 
Lys 

TCC 
Ser 
180 
GCC 
Ala 



CAC 
His 
165 
ACT 
Thr 

TCT 
Ser 



ATC TTG 
He Leu 



CTC TAC CAA 
Leu Tyr Gin 

TAC ATC GAA 
Tyr He Glu 
120 

CTC GGT TTG 
Leu Gly Leu 

135 
GTG GTG GAG 
Val Val Glu 
150 

ATG TGG AAT 
Met Trp Asn 

CTG CCT GGA 
Leu Pro Gly 

ATC ACC AGC 
He Thr Ser 
200 

GGG GGA TGG 
Gly Gly Trp 



GTC ACT AGC 
Val Thr Ser 

GCG TAC TGC 
Ala Tyr Cys 
75 

TTG TCC GGG 
Leu Ser Gly 
90 

GAG TTC GAT 
Glu Phe Asp 
105 

CAA GGA ATG 
Gin Gly Met 

CTG CAA ACA 
Leu Gin Thr 



AAG TTC 
Lys Phe 
45 

ACT TGG 
Thr Trp 

60 
CTG ACA 
Leu Thr 



Gly 
30 
ATC 
He 

GTG 
Val 

ACG 
Thr 



AGG CCG GCC 
Arg Pro Ala 



TCC AAG 
Ser Lys 

TTC ATC 
Phe He 
170 
AAC CCC 
Asn Pro 
185 

CCG CTC 
Pro Leu 



TGG 
Trp 
155 
AGC 
Ser 

GCA 
Ala 

ACC 
Thr 



GAA ATG 
Glu Met 

CAG CTC 
Gin Leu 
125 
GCC ACC 
Ala Thr 
140 

CGA GCC 
Arg Ala 



GAA 
Glu 
110 
GCC 
Ala 

AAG 
Lys 

CTT 
Leu 



GTG GCC GCC 
Val Ala Ala 



GGG ATA CAG 
Gly He Gin 

ATA GCA TCA 
He Ala Ser 
190 

ACC CAA TAT 
Thr Gin Tyr 

205 
CAA CTC GCC 
Gin Leu Ala 
210 215 220

CCC CCC AGT GCC GCT TCA GCC TTC GTG GGC GCC GGT ATA GCT GGC GCG 723 
Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala

225 230 235 

GCT GTT GGC AGC ATA GGC CTC GGG AAG GTG CTT GTG GAC ATT CTG GCG 771 
Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala

240 245 250 

GGT TAT GGA GCA GGG GTG GCA GGC GCG CTC GTG GCC TTT AAG GTC ATG 819 
Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met 
255 260 265 270 

AGC GGT GAC ATG CCC TCC ACC GAG GAC CTG GTC AAC TTA CTC CCC GCC 867 
Ser Gly Asp Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala 

275 280 285 

ATC CTC TCT CCT GGT GCC CTG GTC GTC GGG GTC GTG TGC GCA GCA ATA 915 
Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile

290 295 300 

CTG CGT CGG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC 963 
Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn

305 310 315 

CGG CTG ATA GCG TTT GCT TCG CGG GGC AAC CAT GTC TCC CCC ACG CAC 1011 
Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His

320 325 330 

TAT GTG CCT GAA AGC GAC GCC GCA GCG CGC GTC ACC CAG ATC CTC TCC 1059 
Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser
335 340 345 350 

AAC CTT ACC ATC ACT CAG CTG TTG AAG AGG CTT CAC CAG TGG ATT AAT 1107 
Asn Leu Thr Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn

355 360 365 

GAG GAC TGC TCC ACG CCA TGC TCC GGC TCG TGG CTC AGG GAT GTT TGG 1155 
Glu Asp Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp 

370 375 380 

GAC TGG ATA TGC ACG GTA TTG GCT GAT TTC AAG ACC TGG CTC CAG TCC 1203 
Asp Trp Ile Cys Thr Val Leu Ala Asp Phe Lys Thr Trp Leu Gln Ser

385 390 395 

AAG CTC CTG CCG CGG TTA CCG GGG GTC CCT TTT TTC TCA TGC CAG CGT 1251 
Lys Leu Leu Pro Arg Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg
400 405 410 






GGG TAC AAG GGG GTT TGG CGG GGA GAT GGC ATC ATG TAT ACC ACC TGC 1299 
Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile Met Tyr Thr Thr Cys
415 420 425 430 

CCA TGT GGA GCA CAA ATC ACC GGA CAT GTC AAA AAC GGT TCT ATG AGG 1347 
Pro Cys Gly Ala Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg

435 440 445 

ATC GTT GGG CCT AGA ACC TGT AGC AAC ACG TGG CAC GGA ACA TTT CCC 1395 
Ile Val Gly Pro Arg Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro

450 455 460 

ATC AAC GCG TAC ACC ACA GGC CCC TGC ACA CCC TCC CCG GCG CCA AAC 1443 
Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn

465 470 475 

TAT TCC AGG GCG TTG TGG CGG GTG GCC ATT GAG GAG TAT GTG GAG GTC 1491 
Tyr Ser Arg Ala Leu Trp Arg Val Ala Ile Glu Glu Tyr Val Glu Val

480 485 490 

ACG CGG GTG GGG GAT TTC CAC TAC GTG ACG GGC ATG ACC ACT GAC AAC 1539 
Thr Arg Val Gly Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn 
495 500 505 510 

GTG AAA TGC CCA TGC CAG GTT CCG GCC CCC GAA TTC TTC ACA GAA TTG 1587 
Val Lys Cys Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Leu

515 520 525 

GAT GGG GTG CGG CTG CAC AGG TAC GCT CCG GCG TGC AAA CCT CTC CTG 1635 
Asp Gly Val Arg Leu His Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu 

530 535 540 

CGG GAT GAG GTC ACA TTC CAG GTC GGG CTC AAC CAA TAT ACG GTT GGG 1683 
Arg Asp Glu Val Thr Phe Gln Val Gly Leu Asn Gln Tyr Thr Val Gly

545 550 555 

TCA CAG CTC CCA TGT GAG CCC GAA CCG GAT GTA ACA GTG GTC ACC TCC 1731 
Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser

560 565 570 

ATG CTC ACC GAC CCC TCC CAC ATT ACA GCA GAG GCG GCT AGG CGT AGG 1779 
Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Arg Arg Arg
575 580 585 590

CTG ACC AGA GGG TCT CCC CCT TCC TCG ACC AGT TCT TCA GCT AGT CAG 1827 
Leu Thr Arg Gly Ser Pro Pro Ser Ser Thr Ser Ser Ser Ala Ser Gln

595 600 605 

TTG TCT GCG CTT TCT TCG CAG GCA ACA TGC ACT ACC CAT CAG GGC GCC 1875
Leu Ser Ala Leu Ser Ser Gln Ala Thr Cys Thr Thr His Gln Gly Ala

610 615 620 

CCA GAC ACT GAC CTC ATC GAG GCC AAC CTC CTG TGG CGG CAG GAG ATG 1923 
Pro Asp Thr Asp Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met

625 630 635 

GGC GGA AAC ATC ACC CGC GTG GAG TCA GAG AAC AAG ATA GTA ATT CTA 1971 
Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn Lys Ile Val Ile Leu

640 645 650 

GAC TCT TTT GAA CCG CTT CGA GCG GAG GAG GAT GAG AGG GAA GTG TCC 2019 
Asp Ser Phe Glu Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser 
655 660 665 670 

GTT GCG GCG GAG ATC CTG CGG AAG ACC AGG AAA TTC CCC GCA GCG ATG 2067
Val Ala Ala Glu Ile Leu Arg Lys Thr Arg Lys Phe Pro Ala Ala Met

675 680 685 

CCC GTA TGG GCA CGC CCG GAC TAC AAC CCA CCA TTA CTA GAG TCT TGG 2115 
Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp 

690 695 700 

AAG AAC CCG GAC TAC GTC CCT CCA GTG GTA CAC GGG TGC CCA TTG CCG 2163 
Lys Asn Pro Asp Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro 

705 710 715 

CCT ACC AAG GCC CCT CCA ATA CCA CCT CCA CGA AGA AAG AGA ACG GTT 2211 
Pro Thr Lys Ala Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val

720 725 730 

GTC CTG ACA GAA TCC TCC GTG TCC TCT GCC TTG GCG GAG CTT GCT ACA 2259 
Val Leu Thr Glu Ser Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr 
735 740 745 750 

AAG ACC TTT GGC AGT TCC GGA TCG TCG GCC GTC GAC AGC GGC ACG GCG 2307 
Lys Thr Phe Gly Ser Ser Gly Ser Ser Ala Val Asp Ser Gly Thr Ala 

755 760 765 

ACC GGC CCT CCT GAC CAG GCC TCC GCC GAA GGA GAT GCA GGA TCC GAC 2355 
Thr Gly Pro Pro Asp Gln Ala Ser Ala Glu Gly Asp Ala Gly Ser Asp

770 775 780 

GCT GAG TCG TAC TCC TCC ATG CCC CCC CTT GAG GGA GAG CCG GGG GAC 2403

Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp 

785 790 795 

CCC GAT CTC AGC GAC GGG TCT TGG TCT ACC GTA AGC GAG GAG GCC AGC 2451 
Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser 
800 805 810

GAG GAC GTC GTC TGC TGC TCG ATG TCC TAC ACA TGG ACA GGC GCC TTA 2499 
Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu 
815 820 825 830

ATT ACA CCA TGC GCC GCG GAG GAG AGC AAG CTG CCC ATT AAT GCG CTG 2547 
Ile Thr Pro Cys Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu

835 840 845 

AGC AAC CCT TTG CTG CGC CAC CAC AAC ATG GTC TAT GCC ACA ACA TCC 2595

Ser Asn Pro Leu Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser 

850 855 860 

CGC AGC GCA AGC CAG CGG CAG AAA AAG GTC ACA TTT GAC AGA CTG CAA 2643 
Arg Ser Ala Ser Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln

865 870 875 

GTC CTG GAT GAC CAC TAC CGG GAC GTC CTG AAG GAC ATG AAG GCC AAG 2691 
Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Asp Met Lys Ala Lys 

880 885 890 

GCG TCC ACA GTT AAG GCT AAA CTT CTA TCT GTA GAG GAA GCC TGC AAG 2739 
Ala Ser Thr Val Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys 
895 900 905 910 

CTG ACG CCC CCA CAC TCG GCC AGA TCT AAA TTT GAC TAC GGG GCA AAG 2787 
Leu Thr Pro Pro His Ser Ala Arg Ser Lys Phe Asp Tyr Gly Ala Lys 

915 920 925 

GAC GTC CAG AGC CTG TCC AGC AAG GCC GTT AAC CAC ATC CAC TCC GTG 2835 
Asp Val Gln Ser Leu Ser Ser Lys Ala Val Asn His Ile His Ser Val

930 935 940 

TGG AAG GAC TTG CCG GAA GAC ACT GAG ACA CCA ATC GAC ACC ACC ATC 2883 
Trp Lys Asp Leu Pro Glu Asp Thr Glu Thr Pro Ile Asp Thr Thr Ile

945 950 955 

ATG GCA AAA AAT GAG GTT TTT TGT GTT CAA CCA GAG AAA GGA GGC CGC 2931 
Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg

960 965 970 

AAG CCA GCT CGC CTT ATC GTA TTC CCA GAC TTG GGG GTT CGT GTG TGC 2979 
Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys
975 980 985 990 

GAG AAA ATG GCC CTC TAC GAC GTG GTC TCC ACT CTT CCT CAG GCC GTG 3027 
Glu Lys Met Ala Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Ala Val
995 1000 1005



ATG GGC TCC TCA TAC AGA TTT CAG TGC TCC CCT GGA CAG CGG GTC GAG 3075 
Met Gly Ser Ser Tyr Arg Phe Gln Cys Ser Pro Gly Gln Arg Val Glu

1010 1015 1020 

TTC CTG GTG AAT GCC TGG AAG TCA AAG AAG AGC CCT ATG GGC TTT GCA 3123

Phe Leu Val Asn Ala Trp Lys Ser Lys Lys Ser Pro Met Gly Phe Ala 

1025 1030 1035 

TAT GAC ACC CGC TGT TTT GAC TCA ACG GTC ACC GAG AAC GAC ATC CGT 3171 
Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg

1040 1045 1050 

ACT GAG GAG TCA ATT TAT CAA TGT TGT GAC TTG GAC CCC GAG GCC AGA 3219 
Thr Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Asp Pro Glu Ala Arg
1055 1060 1065 1070 

CAG GCC ATA AGG TCG CTC ACA GAG CGG CTT TAT ATC GGG GGC CCC CTG 3267 
Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu

1075 1080 1085 

ACT AAT TCA AAG GGG CAG AAC TGC GGT TAT CGC CGG TGC CGC GTC AGC 3315 
Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Val Ser

1090 1095 1100 

GGC GTG CTG ACG ACT AGC TGC GGT AAT ACC CTC ACA TGT TAC TTG AAG 3363 
Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys 

1105 1110 1115

GCC TCT GCA GCC TGT CGA GCT GCA AAG CTC CAG GAC TGC ACG ATG CTT 3411 
Ala Ser Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu

1120 1125 1130 

GTG TGC GGA GAC GAC CTT GTC GTT ATC TGT GAT AGC GCG GGA ACT CAG 3459 
Val Cys Gly Asp Asp Leu Val Val Ile Cys Asp Ser Ala Gly Thr Gln
1135 1140 1145 1150 

GAG GAC GCG GCG AGC CTA CGA GTC TTC ACG GAG GCT ATG ACT AGG TAC 3507 
Glu Asp Ala Ala Ser Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr 

1155 1160 1165

TCT GCC CCC CCC GGG GAC CCG CCC CAA CCA GAA TAC GAC TTG GAG CTG 3555 
Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu
1170 1175 1180

ATA ACA TCA TGT TCC TCC AAT GTG TCG GTC GCG CAC GAC GCA TCA GGC 3603 
Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Ala Ser Gly
1185 1190 1195 

AAA CGG GTG TAC TAT CTC ACC CGT GAC CCC ACC ACC CCC CTA GCG CGG 3651
Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg

1200 1205 1210 

GCT GCG TGG GAG ACA GCT AGA CAC ACT CCA GTC AAC TCC TGG CTA GGC 3699 
Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly 
1215 1220 1225 1230 

AAC ATC ATC ATG TAC GCG CCC ACC TTA TGG GCA AGG ATG ATT CTG ATG 3747 
Asn Ile Ile Met Tyr Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met

1235 1240 1245 

ACC CAC TTC TTC TCC ATC CTT CTA GCC CAG GAG CAA CTT GAA AAA GCC 3795 
Thr His Phe Phe Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala

1250 1255 1260 

CTA GAT TGT CAG ATC TAC GGG GCC ACT TAC TCC ATT GAG CCA CTT GAC 3843 
Leu Asp Cys Gln Ile Tyr Gly Ala Thr Tyr Ser Ile Glu Pro Leu Asp

1265 1270 1275 

CTA CCT CAG ATC ATT CAA CGA CTC CAC GGT CTT AGC GCA TTT TCA CTC 3891 
Leu Pro Gln Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu

1280 1285 1290 

CAT AGT TAC TCT CCA GGT GAG ATC AAT AGG GTG GCT TCA TGC CTC AGG 3939 
His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg
1295 1300 1305 1310 

AAA CTT GGG GTA CCG CCC TTG CGA GTC TGG AGA CAT CGG GCC AGA AGC 3987 
Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser 

1315 1320 1325 

GTC CGC GCT AAG CTA CTG TCC CAG GGG GGG AGG GCC GCC ACC TGT GGC 4035 
Val Arg Ala Lys Leu Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly

1330 1335 1340 

AAA TAC CTC TTC AAC TGG GCA GTA AAG ACC AAG CTC AAA CTC ACT CCA 4083 
Lys Tyr Leu Phe Asn Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro 

1345 1350 1355 

ATC CCA GAA GCG TCC CAG CTG GAC TTG TCC GGC TGG TTC GTT GCT GGT 4131 
Ile Pro Glu Ala Ser Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly

1360 1365 1370 

TAC AGC GGG GGA GAC ATA TAT CAC AGC CTG TCT CGT GCC CGA CCC CGC 4179 
Tyr Ser Gly Gly Asp Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg
1375 1380 1385 1390

TGG TTC ATG TGG TGC CTA CTC CTA CTT TCC GTA GGG GTA GGC ATC TAC 4227 
Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr
1395 1400 1405

CTG CTfc CCC AAC CGA TGA GCGGG GAGCTAAACA CTCCAGGCCA ATAGGCCATC 4280
Leu Leu Pro Asn Arg Stop 
1410

CGCCTTTTTT TTTTTT 4296 

SEQ ID NO: 76 
SEQUENCE LENGTH: 818 base pairs

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double

TOPOLOGY: linear 
ANTI-SENSE: No

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: N22-1 


GG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 47 
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
1 5 10 15 

ATA GCG TTT GCT TCG CGG GGC AAC CAT GTC TCC CCC ACG CAC TAT GTG 95 
Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val

20 25 30 

CCT GAA AGC GAC GCC GCA GCG CGC GTC ACC CAG ATC CTC TCC AAC CTT 143 
Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Asn Leu

35 40 45 

ACC ATC ACT CAG CTG TTG AAG AGG CTT CAC CAG TGG ATT AAT GAG GAC 191 
Thr Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp

50 55 60 

TGC TCC ACG CCA TGC TCC GGC TCG TGG CTC AGG GAT GTT TGG GAC TGG 239 
Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp 

65 70 75 

ATA TGC ACG GTA TTG GCT GAT TGC AAG ACC TGG CTC CAG TCC AAG CTC 287 
Ile Cys Thr Val Leu Ala Asp Cys Lys Thr Trp Leu Gln Ser Lys Leu
80 85 90 95 

CTG CCG CGG TTA CCG GGG GTC CCT TTT TTC TCA TGC CAG CGT GGG TAC 335 
Leu Pro Arg Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr
100 105 110

AAG GGG GTT TGG CGG GGA GAT GGC ATC ATG TAT ACC ACC TGC CCA TGT 383 
Lys Gly Val Trp Arg Gly Asp Gly Ile Met Tyr Thr Thr Cys Pro Cys

115 120 125 

GGA GCA CAA ATC ACC GGA CAT GTC AAA AAC GGT TCT ATG AGG ATC GTT 431 
Gly Ala Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val

130 135 140 

GGG CCT AGA ACC TGT AGC AAC ACG TGG CAC GGA ACA TTT CCC ATC AAC 479 
Gly Pro Arg Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro He Asn 

145 150 155 

GCG TAG ACC ACA GGC CCC TGC ACA CCC TCC CCG GCG CCA AAC TAT TCC 527 
Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser 
160 165 170 175 

AGG GCG TTG TGG CGG GTG GCC ATT GAG GAG TAT GTG GAG GTC ACG CGG 575 
Arg Ala Leu Trp Arg Val Ala He Glu Glu Tyr Val Glu Val Thr Arg 
20 180 185 190 

GTG GGG GAT TTC CAC TAC GTG ACG GGC ATG ACC ACT GAC AAC GTG AAA 623 
Val Gly Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys 

195 200 205 

25 TGC CCA TGC CAG GTT CCG GCC CCC GAA TTC TTC ACA GAA TTG GAT GGG 671 

Cys Pro Cys Gin Val Pro Ala Pro Glu Phe Phe Thr Glu Leu Asp Gly 

210 215 220 

GTG CGG CTG CAC AGG TAC GCT CCG GCG TGC AAA CCT CTC CTG CGG GAT 719 
Val Arg Leu His Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg Asp 

225 230 235 

GAG GTC ACA TTC CAG GTC GGG CTC AAC CAA TAT ACG GTT GGG TCA CAG 767 
Glu Val Thr Phe Gin Val Gly Leu Asn Gin Tyr Thr Val Gly Ser Gin 
240 245 250 255 

CTC CCA TGT GAG CCC GAA CCG GAT GTA ACA GTG GTC ACC TCC ATG CTC 815 
Leu Pro Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser Met Leu 

260 265 270 

ACC 
Thr 

SEQ ID NO: 77 

SEQUENCE LENGTH: 818 base pairs 
SEQUENCE TYPE: nucleic acid

STRANDEDNESS: double

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: N22-3 



GG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 47 
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
1 5 10 15

ATA GCG TTT GCT TCG CGG GGC AAC CAT GTC TCC CCC ACG CAC TAT GTG 95
Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val
20 25 30

CCT GAA AGC GAC GCC GCA GCG CGC GTC ACC CAG ATC CTC TCC AAC CTT 143
Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Asn Leu
35 40 45

ACC ATC ACT CAG TTG TTG AAG AGG CTC CAC CAG TGG ATT AAT GAG GAC 191
Thr Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp
50 55 60

TGC TCC ACG CCA TGC TCC GGC TCG TGG CTC AGG GAT GTT TGG GAC TGG 239
Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp
65 70 75

ATA TGC ACG GTA TTG GCT GAT TTC AAG ACC TGG CTC CAG TCC AAG CTC 287
Ile Cys Thr Val Leu Ala Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu
80 85 90 95

CTG CCG CGG TTA CCG GGG GTC CCT TTC TTC TCA TGC CAG CGT GGG TAC 335
Leu Pro Arg Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr
100 105 110

AAG GGG GTT TGG CGG GGA GAC GGC ATC ATG TAT ACC ACC TGC CCA TGT 383
Lys Gly Val Trp Arg Gly Asp Gly Ile Met Tyr Thr Thr Cys Pro Cys
115 120 125

GGA GCA CAA ATC ACC GGA CAT GTC AAA AAC GGT TCT ATG AGG ATC GTT 431
Gly Ala Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val
130 135 140

GGG CTT AGA ACC TGT AGC AAC ACG TGG CAC GGA ACA TTC CCC ATC AAC 479
Gly Leu Arg Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn
145 150 155

GCG TAC ACC ACA GGC CCC TGC ACA CCC TCT CCA GCG CCG AAC TAC TCC 527
Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser
160 165 170 175

AGG GCG TTA TGG CGG GTA GCC GCT GAG GAG TAT GTG GAG GTC ACG CGG 575
Arg Ala Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg
180 185 190

GTG GGG GAT TTC CAC TAC GTG ACG GGC ATG ACC ACT GAC AAC GTA AAA 623
Val Gly Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys
195 200 205

TGC CCA TGC CAG GTT CCG GCC CCC GAA TTC TTC ACA GAA TTG GAT GGG 671
Cys Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Leu Asp Gly
210 215 220

GTG CGG CTG CGC AGG TAC GCT CCG GCG TGC AAA CCT CTC CTG CGG GAT 719
Val Arg Leu Arg Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg Asp
225 230 235

GAG GTC ACA TTC CAG GTC GGG CTC AAC CAA TAT ACG GTT GGG TCA CAG 767
Glu Val Thr Phe Gln Val Gly Leu Asn Gln Tyr Thr Val Gly Ser Gln
240 245 250 255

CTC CCA TGT GAG CCC GAA CCG GAT GTA ACG GTG GTC ACC TCC ATG CTC 815

Leu Pro Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser Met Leu 

260 265 270 

ACC 
Thr

SEQ ID NO: 78

SEQUENCE LENGTH: 818 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: H22-3 

GG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 47
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
1 5 10 15

ATA GCG TTC GCT TCG CGG GGT AAC CAC GTC TCC CCC ACG CAT TAT GTG 95
Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val
20 25 30

CCT GAG AGC GAC GCC GCA GCG CGT GTC ACC CAG ATC CTC TCC AGC CTT 143
Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu
35 40 45

ACC ATC ACT CAG CTG CTG AAG AGG CTC CAC CAG TGG ATT GAT GAG GAC 191
Thr Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asp Glu Asp
50 55 60

TGC TCC ACG CCA TGT TCT GGT TCG TGG CTC AGG GAT GTT TGG GAC TGG 239
Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp
65 70 75

ATA TGC ACG GTG TTG AGT GAC TTC AAG ACC TGG CTC CAG TCC AAG CTC 287
Ile Cys Thr Val Leu Ser Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu
80 85 90 95

CTG CCG CGG CTA CCG GGA GTC CCT TTC CTC TCA TGC CAA CGT GGG TAC 335
Leu Pro Arg Leu Pro Gly Val Pro Phe Leu Ser Cys Gln Arg Gly Tyr
100 105 110

AAG GGA GTC TGG CGG GGA GAT GGC ATC ATG CAG ACC ACC TGC CCA TGC 383
Lys Gly Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys
115 120 125

GGA GCA CAA ATC GCC GGA CAT GTC AAA AAT GGT TCT ATG AGG ATC ACT 431
Gly Ala Gln Ile Ala Gly His Val Lys Asn Gly Ser Met Arg Ile Thr
130 135 140

GGC CCC AGA ACC TGT AGC AAC ACG TGG CAC GGA ACG TTC CCC ATC AAC 479
Gly Pro Arg Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn
145 150 155

GCG TAC ACC ACA GGC CCC TGC ACA CCC TCC CCA GCG CCG AAC TAC TCC 527
Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser
160 165 170 175

AGG GCG TTA TGG CGG GTA GCT GCT GAG GAG TAT GTG GAG GTC ACG CGG 575
Arg Ala Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg
180 185 190

GTG GGG GAC TTC CAC TAC GTG ACG GGC ATG ACC ACT GAC AAC TTG AAA 623
Val Gly Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys
195 200 205

TGC CCA TGC CAG GTC CCG GCC CCC GAA TTC TTC ACG GAG TTG GAT GGG 671
Cys Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Leu Asp Gly
210 215 220

GTA CGG CTA CAC AGG TAC GCT CCG GCG TGC AAA CCT CTC CTA CGG GAT 719
Val Arg Leu His Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg Asp
225 230 235

GAG GTC ACA TTC CAG GTC GGG CTC AAC CAA TTC CCG GTT GGG TCG CAG 767
Glu Val Thr Phe Gln Val Gly Leu Asn Gln Phe Pro Val Gly Ser Gln
240 245 250 255

CTC CCA TGC GAG CCC GAA CCG GAT GTA ATA GTG GTC ACC TCC ATG CTC 815
Leu Pro Cys Glu Pro Glu Pro Asp Val Met Val Val Thr Ser Met Leu

260 265 270 

ACC 
Thr

SEQ ID NO: 79

SEQUENCE LENGTH: 818 base pairs

SEQUENCE TYPE: nucleic acid

STRANDEDNESS: double

TOPOLOGY: linear

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: H22-8 






GG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 47 
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
1 5 10 15

ATA GCG TTC GCC TCG CGG GGT AAC CAC GTC TCC CCC ACG CAT TAT GTG 95
Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val
20 25 30

CCT GAG AGC GAC GCC GCG GCG CGT GTC ACC CAG ATC CTC TCC AGC CTC 143
Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu
35 40 45

ACC ATC ACT CAG CTG CTG AAG AGG CTC CAC CAG TGG ATT AAT GAG GAC 191
Thr Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp
50 55 60

TGC TCC ACG CCA TGT TCT GGT TCG TGG CTC AGG GAT GTT TGG GAC TGG 239
Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp
65 70 75

ATA TGC ACG GTG TTG AGT GAC TTC AAG ACC TGG CTC CAG TCC AAG CTC 287
Ile Cys Thr Val Leu Ser Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu
80 85 90 95

CTG CCG CGG CTA CCG GGA GTC CCT TTC CTT TCA TGC CAA CGT GGG TAC 335
Leu Pro Arg Leu Pro Gly Val Pro Phe Leu Ser Cys Gln Arg Gly Tyr
100 105 110

AAG GGA GTC TGG CGG GGA GAT GGC ATC ATG CAA ACC ACC TGC CCA TGC 383
Lys Gly Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys
115 120 125

GGA GCA CAA ATC GCC GGA CAT GTC AAA AAT GGT TCC ATG AGG ATC ACT 431
Gly Ala Gln Ile Ala Gly His Val Lys Asn Gly Ser Met Arg Ile Thr
130 135 140

GGC CCC AGA ACC TGT AGC AAC ACG TGG CAC GGA ACG TTC CCC ATC AAC 479
Gly Pro Arg Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn
145 150 155

GCG TAC ACC ACA GGC CCC TGC ACA CCC TCC CCA GCG CCG AAC TAT TCT 527
Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser
160 165 170 175

AGG GCG TTG TGG CGG GTA GCT GCT GAG GAG TAT GTG GAG GTC ACG CGG 575
Arg Ala Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg
180 185 190

GTG GGG GAT TTC CAC TAC GTG ACG GGC ATG ACC ACT GAC AAC TTG AAA 623
Val Gly Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys
195 200 205

TGC CCA TGC CAG GTC CCG GCC CCC GAA TTC TTC ACG GAG TTG GAT GGG 671
Cys Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Leu Asp Gly
210 215 220

GTA CGG CTA CAC AGA TAC GCT CCG GCG TGC AAA CCT CTC CTA CGG GAT 719
Val Arg Leu His Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg Asp
225 230 235

GAG GTC ACA TTC CAG GTC GGG CTC AAC CAA TTC CCG GTT GGG TCG CAG 767
Glu Val Thr Phe Gln Val Gly Leu Asn Gln Phe Pro Val Gly Ser Gln






EP0 518 313 A2 




240 245 250 255

CTC CCA TGC GAG CCC GAA CCG GAT GTA ACA GTG GTC ACC TCC ATG CTC 815 

Leu Pro Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser Met Leu 

260 265 270

ACC 

Thr 

SEQ ID NO: 80

SEQUENCE LENGTH: 818 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: H22-9 



GG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG 47
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu
1 5 10 15

ATA GCG TTC GCT TCG CGG GGT AAC CAC GTC TCC CCC ACG CAT TAT GTG 95
Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val
20 25 30

CCT GAG AGC GAC GCC GCA GCG CGT GTC ACC CAG ATC CTC TCC AGC CTT 143
Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu
35 40 45

ACC ATC ACT CAG CTG TTG AAG AGG CTC CAC CAG TGG ATT AAT GAT GAC 191
Thr Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Asp Asp
50 55 60

TGC TCC ACG CCA TGT TCT GGT TCG TGG CTC AGG GAT GTT TGG GAC TGG 239
Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp
65 70 75

ATA TGC ACG GTG TTG AGT GAC TTC AAG ACC TGG CTC CAG TCC AAG CTC 287
Ile Cys Thr Val Leu Ser Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu
80 85 90 95

CTG CCG CGG CTA CCG GGA GTC CCT TTC CTC TCA TGC CAA CGT GGG TAC 335
Leu Pro Arg Leu Pro Gly Val Pro Phe Leu Ser Cys Gln Arg Gly Tyr
100 105 110

AAG GGA GTC TGG CGG GGA GAT GGC ATC ATG CAT ACC ACC TGC CCA TGC 383
Lys Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Thr Cys Pro Cys
115 120 125

GGA GCA CAA ATC GCC GGA CAT GTC AAA AAT GGT TCC ATG AGG ATC ACT 431
Gly Ala Gln Ile Ala Gly His Val Lys Asn Gly Ser Met Arg Ile Thr
130 135 140

GGC CCC AGA ACC TGT AGC AAC ACG TGG CGC GGA ACG TTC CCC ATC AAC 479
Gly Pro Arg Thr Cys Ser Asn Thr Trp Arg Gly Thr Phe Pro Ile Asn
145 150 155

GCG TAC ACC ACA GGC CCC TGC ACA CCC TCC CCA GCG CCG AAC TAT TCT 527
Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser
160 165 170 175

AAG GCG TTG TGG CGG GTA GCT GCT GAG GAG TAT GTG GAG GTC ACG CGG 575
Lys Ala Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg
180 185 190

GTG GGG GAT TTC CAC TAC GTG ACG GGC ATG ACC ACT GAC AAC TTG AAA 623
Val Gly Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys
195 200 205

TGC CCA TGC CAG GTC CCG GCC CCC GAA TTT TTC ACG GAG TTG GAT GGG 671
Cys Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Leu Asp Gly
210 215 220

GTA CGG CTA CAC AGG TAC GCT CCG GCG TGC AAA CCT CTC CTA CGG GAT 719
Val Arg Leu His Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg Asp
225 230 235

GAG GTC ACA TTC CAG GTC GGG CTC AAC CAA TTC CCG GTT GGG TCG CAG 767
Glu Val Thr Phe Gln Val Gly Leu Asn Gln Phe Pro Val Gly Ser Gln
240 245 250 255

CTA CCA TGC GAG CCC GAA CCG GAT GTA GCA GTG GTC ACC TCC ATG CTC 815
Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Val Thr Ser Met Leu
260 265 270

ACC
Thr 

SEQ ID NO: 81 
SEQUENCE LENGTH: 311 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N17-3

TGT GAG CCC GAA CCG GAT GTA ACA GTG GTC ACC TCC ATG CTC ACC GAC 48 
Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser Met Leu Thr Asp 

1 5 10 15 

CCC TCC CAC ATT ACA GCA GAG GCG GCT AGG CGT AGG CTG ACC AGA GGG 96 
Pro Ser His Ile Thr Ala Glu Ala Ala Arg Arg Arg Leu Thr Arg Gly
20 25 30

TCT CCC CCT TCC TCG ACC AGT TCT TCA GCT AGT CAG TTG TCT GCG CTT 144
Ser Pro Pro Ser Ser Thr Ser Ser Ser Ala Ser Gln Leu Ser Ala Leu
35 40 45

TCT TCG CAG GCA ACA TGC ACT ACC CAT CAG GGC GCC CCA GAC ACT GAC 192
Ser Ser Gln Ala Thr Cys Thr Thr His Gln Gly Ala Pro Asp Thr Asp
50 55 60

CTC ATC GAG GCC AAC CTC CTG TGG CGG CAG GAG ATG GGC GGA AAC ATC 240
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
65 70 75 80

ACC CGC GTG GAG TCA GAG AAC AAG ATA GTA ATT CTA GAC TCT TTT GAA 288
Thr Arg Val Glu Ser Glu Asn Lys Ile Val Ile Leu Asp Ser Phe Glu

85 90 95 

CCG CTT CGA GCG GAG GAG GAT G A 311 
Pro Leu Arg Ala Glu Glu Asp 

100 






SEQ ID NO: 82 
SEQUENCE LENGTH: 311 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N17-1

TGT GAG CCC GAA CCG GAT GTA ACA GTG GTC ACC TCC ATG CTC ACC GAC 48 
Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser Met Leu Thr Asp 
1 5 10 15

CCC TCC CAC ATC ACA GCA GAG GCG GCT AGG CGT AGG CTG GCC AGA GGG 96
Pro Ser His Ile Thr Ala Glu Ala Ala Arg Arg Arg Leu Ala Arg Gly
20 25 30

TCT CCT CCT TCT TCG GCC AGC TCT TCA GCT AGC CAG TTG TCT GCG CCA 144
Ser Pro Pro Ser Ser Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro
35 40 45

TCT TTG AAG GCG ACA TGT ACT ACC CAT CAA GAC TCC CCA GAC GCT GAC 192
Ser Leu Lys Ala Thr Cys Thr Thr His Gln Asp Ser Pro Asp Ala Asp
50 55 60

CTC ATC GAG GCC AAC CTC CTG TGG CGG CAG GAG ATG GGC GGG AAC ATC 240
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
65 70 75 80

ACC CGC GTG GAG TCA GAG AAC AAG ATA GTG ATT CTA GAC TCT TCT GAA 288
Thr Arg Val Glu Ser Glu Asn Lys Ile Val Ile Leu Asp Ser Ser Glu
85 90 95

CCG CTT CGA GCG GAG GAG GAT G A 311
Pro Leu Arg Ala Glu Glu Asp 

100 






SEQ ID NO: 83 

SEQUENCE LENGTH: 311 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N17-2

TGT GAG CCC GAA CCG GAT GTA ACA GTG GTC ACC TCC ATG CTC ACC GAC 48 
Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser Met Leu Thr Asp
1 5 10 15

CCC TCC CAC ATC ACA GCA GAG GCG GCT AGG CGT AGG CTG GCC AGA GGG 96
Pro Ser His Ile Thr Ala Glu Ala Ala Arg Arg Arg Leu Thr Arg Gly
20 25 30

TCT CCT CCT TCT TTG GCC AGC TCT TCA GCT AGT CAG TTG TCT GCG CCA 144
Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro
35 40 45

TCT TTG AAG GCG ACA TGC ACT ACC CAT CAT GAC TCC CCA GAC GCT GAC 192
Ser Leu Lys Ala Thr Cys Thr Thr His His Asp Ser Pro Asp Ala Asp
50 55 60

CTC ATC GAG GCC AAC CTC CTG TGG CGG CAG GAG ATG GGC GGG AAC ATC 240
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
65 70 75 80

ACC CGC GTG GAG TTA GAG AAC AAG ATA GTA ATT CTA GAC TCT TTT GAA 288
Thr Arg Val Glu Leu Glu Asn Lys Ile Val Ile Leu Asp Ser Phe Glu
85 90 95

CCG CTT CGA GCG GAG GAG GAT G A 311
Pro Leu Arg Ala Glu Glu Asp

100 

SEQ ID NO: 84 

SEQUENCE LENGTH: 311 base pairs 
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: H17-1



TGT GAG CCC GAA CCG GAT GTA ACA GTG CTC ACT TCC ATG CTC ACC GAC 48
Cys Glu Pro Glu Pro Asp Val Thr Val Leu Thr Ser Met Leu Thr Asp
1 5 10 15

CCC TCC CAC ATT ACA GCA GAG ACG GCT AAG CGT AGG CTG GCC AGA GGG 96
Pro Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly
20 25 30

TCT CCC CCT CCC TTG GCC AGC TCT TCA GCT AGT CAG TTG TCT GCG CCC 144
Ser Pro Pro Pro Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro
35 40 45

TCC CTG AAG GCG ACA TGC ACT ACC CAT CAT GAC TCC CCG GAC GCT GAC 192
Ser Leu Lys Ala Thr Cys Thr Thr His His Asp Ser Pro Asp Ala Asp
50 55 60

CTC ATC GAG GCC AAC CTC CTG TGG CGG CAG GAG ATG GGA GGG AAC ATC 240
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
65 70 75 80

ACC CGT GTG GAG TCA GAG AAC AAG GTA GTA ATT CTG GAC TCT TTC GAC 288
Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp

85 90 95 

CCG CTT CGA GCG GAG GAG GAT G A 311 
Pro Leu Arg Ala Glu Glu Asp 

100 

SEQ ID NO: 85 

SEQUENCE LENGTH: 311 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: H17-3

TGT GAG CCC GAA CCG GAT GTA ACA GTG GTC ACT TCC ATG CTC ACC GAC 48 
Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser Met Leu Thr Asp 

1 5 10 15 

CCC TCC CAC ATT ACA GCA GAG GCG GCT GGG CGT AGG CTG GCC AGA GGG 96 
Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly
20 25 30

TCT CCC CCT TCC TTG GCC AGC TCT TCA GCT AGT CAG TTG TCT GCG CCC 144
Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro
35 40 45

TCT CTG AAG GCG ACA TGC ACT ACC CAT CAT GAC TCC CCG GAC GCT GAC 192
Ser Leu Lys Ala Thr Cys Thr Thr His His Asp Ser Pro Asp Ala Asp
50 55 60

CTC ATC GAG GCC AAC CTC CTA TGG CGG CAG GAG ATG GGA GGG AAC ATC 240
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
65 70 75 80

ACC CGC GTG GAG TCA GAG AGC AAG GTA GTA ATT CTG GAC TCT TTC GAC 288
Thr Arg Val Glu Ser Glu Ser Lys Val Val Ile Leu Asp Ser Phe Asp
85 90 95

CCG CTT CGA GCG GAG GAG GAT G A 311
Pro Leu Arg Ala Glu Glu Asp

100 

SEQ ID NO: 86 
SEQUENCE LENGTH: 740 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: 028-1


GTG GTA GTC CTG GAC TCG TTG GAG CCG CTT CAA GCG AAG GAA GGT GAG 48 
Val Val Val Leu Asp Ser Leu Glu Pro Leu Gin Ala Lys Glu Gly Glu 

1 5 10 15

AGG GAA GTG TCC GTT GCG GCG GAG ATC CTG CGG AAG ACC AGG AAA TTC 96
Arg Glu Val Ser Val Ala Ala Glu Ile Leu Arg Lys Thr Arg Lys Phe
20 25 30

CCC GCA GCG ATG CCC GTA TGG GCA CGC CCG GAC TAC AAC CCA CCA TTA 144
Pro Ala Ala Met Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu
35 40 45

CTA GAG TCT TGG AAG AAC CCG GAC TAC GTC CCT CCA GTG GTA CAC GGG 192
Leu Glu Ser Trp Lys Asn Pro Asp Tyr Val Pro Pro Val Val His Gly
50 55 60

TGC CCA TTG CCG CCT ACC AAG GCC CCT CCA ATA CCA CCT CCA CGA AGA 240
Cys Pro Leu Pro Pro Thr Lys Ala Pro Pro Ile Pro Pro Pro Arg Arg
65 70 75 80

AAG AGA ACG GTT GTC CTG ACA GAA TCC TCC GTG TCC TCT GCC TTG GCG 288
Lys Arg Thr Val Val Leu Thr Glu Ser Ser Val Ser Ser Ala Leu Ala
85 90 95

GAG CTT GCT ACA AAG ACC TTT GGC AGT TCC GGA TCG TCG GCC GTC GAC 336
Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Gly Ser Ser Ala Val Asp
100 105 110

AGC GGC ACG GCG ACC GGC CCT CCT GAC CAG GCC TCC GCC GAA GGA GAT 384
Ser Gly Thr Ala Thr Gly Pro Pro Asp Gln Ala Ser Ala Glu Gly Asp
115 120 125

GCA GGA TCC GAC GCT GAG TCG TAC TCC TCC ATG CCC CCC CTT GAG GGA 432
Ala Gly Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly
130 135 140

GAG CCG GGG GAC CCC GAT CTC AGC GAC GGG TCT TGG TCT ACC GTA AGC 480
Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser
145 150 155 160

GAG GAG GCC AGC GAG GAC GTC GTC TGC TGC TCG ATG TCC TAC ACA TGG 528
Glu Glu Ala Ser Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp
165 170 175

ACA GGC GCC TTA ATT ACA CCA TGC GCC GCG GAG GAG AGC AAG CTG CCC 576
Thr Gly Ala Leu Ile Thr Pro Cys Ala Ala Glu Glu Ser Lys Leu Pro
180 185 190

ATT AAT GCG CTG AGC AAC CCT TTG CTG CGC CAC CAC AAC ATG GTC TAT 624
Ile Asn Ala Leu Ser Asn Pro Leu Leu Arg His His Asn Met Val Tyr
195 200 205

GCC ACA ACA TCC CGC AGC GCA AGC CAG CGG CAG AAA AAG GTC ACA TTT 672
Ala Thr Thr Ser Arg Ser Ala Ser Gln Arg Gln Lys Lys Val Thr Phe
210 215 220

GAC AGA CTG CAA GTC CTG GAT GAC CAC TAC CGG GAC GTG CTC AAG GAC 720
Asp Arg Leu Gln Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Asp
225 230 235 240

ATG AAG GCC AAG GCG TCC AC 740
Met Lys Ala Lys Ala Ser
245



SEQ ID NO: 87 

SEQUENCE LENGTH: 740 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: 028-2

GTG GTA GTC CTG GAC TCG TTG GAC CCG CTT CGA GCG GAG GAA GAT GAG 48
Val Val Val Leu Asp Ser Leu Asp Pro Leu Arg Ala Glu Glu Asp Glu
1 5 10 15

AGG GAA GTG TCC GTT GCG GCG GAG ATC CTG CGA AAG ACC AAG AAA TTC 96
Arg Glu Val Ser Val Ala Ala Glu Ile Leu Arg Lys Thr Lys Lys Phe
20 25 30

CCC GCA GCG ATG CCC GTA TGG GCA CGC CCG GAC TAC AAC CCA CCA TTA 144
Pro Ala Ala Met Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu
35 40 45

CTA GAG TCT TGG AAG AAC CCG GAC TAC GTC CCT CCG GTG GTA CAC GGG 192
Leu Glu Ser Trp Lys Asn Pro Asp Tyr Val Pro Pro Val Val His Gly
50 55 60

TGC CCA TTG CCG CCT ACC AAG GCC CCT CCA ATA CCA CCT CCA CGG AGA 240
Cys Pro Leu Pro Pro Thr Lys Ala Pro Pro Ile Pro Pro Pro Arg Arg
65 70 75 80

AAG AGG ACG GTT GCC CTG ACA GAA TCC ACC GTG TCC TCT GCC TTG GCG 288
Lys Arg Thr Val Ala Leu Thr Glu Ser Thr Val Ser Ser Ala Leu Ala
85 90 95

GAG CTT GCT ACA AAG ACC TTT GGC AGT TCC GGA TCG TCG GCC GTC GAC 336
Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Gly Ser Ser Ala Val Asp
100 105 110

AGC GGC ACG GCG ACT GGC CCT CCT GAC CAG GCC TCC GCC GAA GGA GAT 384
Ser Gly Thr Ala Thr Gly Pro Pro Asp Gln Ala Ser Ala Glu Gly Asp
115 120 125

GCA GGA TCC GAC GCT GAG TCG TAC TCC TCC ATG CCC CCC CTT GAG GGA 432
Ala Gly Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly
130 135 140

GAG CCG GGG GAC CCT GAT CTC AGC GAC GGG TCT TGG TCT ACT GTA AGC 480
Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser
145 150 155 160

GAG GAG GCC GGC GAG GAC GTC GTC TGC TGC TCG ATG TCC TAC ACA TGG 528
Glu Glu Ala Gly Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp
165 170 175

ACA GGC GCC TTA ATT ACA CCA TGC GCC GCG GAG GAG AGC AAG CTG CCC 576
Thr Gly Ala Leu Ile Thr Pro Cys Ala Ala Glu Glu Ser Lys Leu Pro
180 185 190

ATT AAT GCG CTG AGC AAC TCT TTG CTG CGC CAC CAC AAC ATG GTC TAT 624
Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Met Val Tyr
195 200 205

GCC ACA ACA TCC CGC AGC GCA AGC CAG CGG CAG AAA AAG GTC ACA TTT 672
Ala Thr Thr Ser Arg Ser Ala Ser Gln Arg Gln Lys Lys Val Thr Phe
210 215 220

GAC AGA CTG CAA GTC CTG GAT GAC CAC TAC CGG GAC GTG CTC AAG GAC 720
Asp Arg Leu Gln Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Asp
225 230 235 240

ATG AAG GCC AAG GCG TCC AC 740
Met Lys Ala Lys Ala Ser
245



SEQ ID NO: 88 

SEQUENCE LENGTH: 740 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: 028-4



GTG GTA GTC CTG GAC TCG TTG GAC CCG CTT CGA GCG GAG GAA GAT GAG 48 
Val Val Val Leu Asp Ser Leu Asp Pro Leu Arg Ala Glu Glu Asp Glu 
1 5 10 15

AGG GAA GTG TCC GTT GCG GCG GAG ATC CTG CGA AAG ACC AAG AAA TTC 96
Arg Glu Val Ser Val Ala Ala Glu Ile Leu Arg Lys Thr Lys Lys Phe
20 25 30

CCC GCA GCG ATG CCC GTA TGG GCA CGC CCG GAC TAC AAC CCA CCA TTA 144
Pro Ala Ala Met Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu
35 40 45

CTA GAG TCT TGG AAG AAC CCG GAC TAC GTC CCT CCG GTG GTA CAC GGG 192
Leu Glu Ser Trp Lys Asn Pro Asp Tyr Val Pro Pro Val Val His Gly
50 55 60

TGC CCA TTG CCG CCT ATC AAG GCC CCT CCA ATA CCA CCT CCA CGG AGA 240
Cys Pro Leu Pro Pro Ile Lys Ala Pro Pro Ile Pro Pro Pro Arg Arg
65 70 75 80

AAG AGG ACG GTT GTC CTG ACA GAA TCC ACC GTG TCC TCT GCC TTG GCG 288
Lys Arg Thr Val Val Leu Thr Glu Ser Thr Val Ser Ser Ala Leu Ala
85 90 95

GAG CTT GCT ACA AAG ACC TTT GGC AGT TCC GGA TCG TCG GCC GTC GAC 336
Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser Gly Ser Ser Ala Val Asp
100 105 110

AGC GGC ACG GCG ACC GGC CCT CCT GAC CAG GCC TCC GCC GAA GGA GAT 384
Ser Gly Thr Ala Thr Gly Pro Pro Asp Gln Ala Ser Ala Glu Gly Asp
115 120 125

GCA GGA TCC GAC GCT GAG TCG TAC TCC TCC ATG CCC CCC CTT GAG GGA 432
Ala Gly Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly
130 135 140

GAG CCG GGG GAC CCT GAT CTC AGC GAC GGG TCT TGG TCT ACT GTA AGC 480
Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser
145 150 155 160

GAG GAG GCC GGC GAG GAC GTC GTC TGC TGC TCG ATG TCC TAC ACA TGG 528
Glu Glu Ala Gly Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp
165 170 175

ACA GGC GCC TTA ATT ACA CCA TGC ACC GCG GAG GAG AGC AAG CTG CCC 576
Thr Gly Ala Leu Ile Thr Pro Cys Thr Ala Glu Glu Ser Lys Leu Pro
180 185 190

ATT AAT GCG CTG AGC AAC TCT TTG CTG CGT CAC CAC AAC ATG GTC TAT 624
Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Met Val Tyr
195 200 205

GCC ACA ACA TCC CGC AGC GCA AGC CAG CGG CAG AAA AAG GTC ACA TTT 672
Ala Thr Thr Ser Arg Ser Ala Ser Gln Arg Gln Lys Lys Val Thr Phe
210 215 220

GAC AGA CTG CAA GTC CTG GAT GAC CAC TAC CGG GAC GTG CTC AAG GAC 720
Asp Arg Leu Gln Val Leu Asp Asp His Tyr Arg Asp Val Leu Lys Asp
225 230 235 240

ATG AAG GCC AAG GCG TCC AC 740
Met Lys Ala Lys Ala Ser
245



SEQ ID NO: 89 

SEQUENCE LENGTH: 515 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N29-1 



AC TAC CGG GAC GTG CTG AAG GAG ATG AAG GCG AAG GCG TCC ACA GTT 47
Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val
1 5 10 15

AAG GCT AAA CTT CTA TCT GTA GAG GAA GCC TGC AAG CTG ACG CCC CCA 95
Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro
20 25 30

CAC TCG GCC AGA TCT AAA TTT GGC TAC GGG GCA AAG GAC GTC CGG AGC 143
His Ser Ala Arg Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Ser
35 40 45

CTG TCC AGC AAG GCC GTT AAC CAC ATC CGC TCC GTG TGG AAG GAC TTG 191
Leu Ser Ser Lys Ala Val Asn His Ile Arg Ser Val Trp Lys Asp Leu
50 55 60

CTG GAA GAC ACT GAG ACA CCA ATT GAC ACC ACC ATC ATG GCA AAA AAT 239
Leu Glu Asp Thr Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn
65 70 75

GAG GTT TTC TGT GTT CAA CCA GAG AAA GGA GGC CGC AAG CCA GCT CGC 287
Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg
80 85 90 95

CTT ATC GTA TTC CCA GAC TTG GGG GTT CGT GTG TGC GAG AAA ATG GCC 335
Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala
100 105 110

CTC TAC GAC GTG GTC TCC ACT CTT CCT CAG GCC GTG ATG GGC TCC TCA 383
Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser Ser
115 120 125

TAC GGA TTC CAG TAC TCC CCT GGA CAG CGG GTC GAG TTC CTG GTG AAT 431
Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn
130 135 140

GCC TGG AAG TCA AAG AAG AGC CCT ATG GGC TTT GCA TAT GAC ACC CGC 479
Ala Trp Lys Ser Lys Lys Ser Pro Met Gly Phe Ala Tyr Asp Thr Arg
145 150 155

TGT TTT GAC TCA ACG GTC ACC GAG AAC GAC ATC CGT 515
Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg
160 165 170 

SEQ ID NO: 90 

SEQUENCE LENGTH: 515 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N29-2 



AC TAC CGG GAC GTG CTG AAG GAG ATG AAG GCG AAG GCG TCC ACA GTT 47 
Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val 
1 5 10 15 

AAG GCT AAA CTT CTA TCT GTA GAG GAA GCC TGT AAG CTG ACG CCC CCA 95 
Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro 

20 25 30 

CAC TCG GCC AGA TCT AAA TTT GGC TAC GGG GCA AAG GAC GTC CGG AGC 143 
His Ser Ala Arg Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Ser 
45 35 40 45 
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70 



20 



CTG TCC AGC AAG GCC GTT AAC CAC ATC CGC TCC GTG TGG AAG GAC TTG 191 
Leu Ser Ser Lys Ala Val Asn His lie Arg Ser Val Trp Lys Asp Leu 

50 55 60 

CTG GAA GAC ACT GAG ACA CCA ATT GAC ACC ACC ATC ATG GCA AAA AAT 239 
Leu Glu Asp Thr Glu Thr Pro lie Asp Thr Thr lie Met Ala Lys Asn 

65 70 75 

GAG GTT TTC TGT GTT CAA CCA GAG AAA GGA GGC CGC AAG CCA GCT CGC 287 
Glu Val Phe Cys Val Gin Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg 
80 85 90 95 

CTT ATC GTA TTC CCA GAC TTG GGG GTT CGT GTG TGC GAG AAA ATG GCC 335 
Leu He Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala 
75 100 105 110 

CTC TAC GAC GTG GTC TCC ACT CTT CCT CAG GCC GTG ATG GGC TCC TCA 383 
Leu Tyr Asp Val Val Ser Thr Leu Pro Gin Ala Val Met Gly Ser Ser 

115 120 125 

TAC GGA TTC CAG TAC TCC CCT GGA CAG CGG GTC GAG TTC CTG GTG AAT 431 
Tyr Gly Phe Gin Tyr Ser Pro Gly Gin Arg Val Glu Phe Leu Val Asn 

130 135 140 

GCC TGG AAG TCA AAG AAG AGT CCT ATG GGC TTT GCA TAT GAC ACC CGC 479 
Ala Trp Lys Ser Lys Lys Ser Pro Met Gly Phe Ala Tyr Asp Thr Arg 

145 150 155 

TGT TTT GAC TCA ACG GTC ACC GAG AAC GAC ATC CGT 515 
Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg
160 165 170 

SEQ ID NO: 91 

SEQUENCE LENGTH: 503 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N29-3 

AC TAC CGG GAC GTG CTG AAG GAG ATG AAG GCG AAG GCG TCC ACA GTT 47
Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val
1 5 10 15

AAG GCT AAA CTT CTA TCT GTA GAG GAA GCC TGT AAG CTG ACG CCC CCA 95 
Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro

20 25 30 

CAC TCG GCC AGA TCT AAG TTT GGC TAC GGG GCA AAG GAC GTC CGG AGC 143 
His Ser Ala Arg Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Ser 
35 40 45

CTG TCC AGC AAG GCC GTT AAC CAC ATC CGC TCC GTG TGG AGG GAC TTG 191 
Leu Ser Ser Lys Ala Val Asn His Ile Arg Ser Val Trp Glu Asp Leu
50 55 60 

CTG GAA GAC ACT GAA ACA CCA ATT GAC ACC ACC ATC ATG GCA AAA AAT 239
Leu Glu Asp Thr Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn

65 70 75 

GAG GTT TTC TGT GTT CAA CCA GAG AAA GGA GGC CGC AAG CCA GCT CGC 287 
Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg
80 85 90 95 

CTT ATC GTA TTC CCA GAC TTG GGG GTT CGT GTG TGC GAG AAA ATG GCC 335 
Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala

100 105 110 

CTC TAC GAC GTG GTC TCC ACT CTT CCT CAG GCC GTG ATG GGC TCC TCA 383 
Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser Ser

115 120 125 

TAC GGA TTC CAG TAC TCC CCT GGA CAG CGG GTC GAG TTC CTG GTG AAT 431 
Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn

130 135 140 

GCC TGG AAG TCA AAG AAG AGT CCT ATG GGC TTT TCA TAT GAC ACC CGC 479 
Ala Trp Lys Ser Lys Lys Ser Pro Met Gly Phe Ser Tyr Asp Thr Arg 

145 150 155 

TGT TTT GAC TCA ACG GTC ACC GAG 503 
Cys Phe Asp Ser Thr Val Thr Glu 
160 165 

SEQ ID NO: 92

SEQUENCE LENGTH: 401 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus

IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: N18-4 

TG GGG ATC CCG TAT GAT ACC CGC TGC TTT GAC TCA ACG GTC ACT GAG 47
Gly Ile Pro Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1 5 10 15 

AAT GAC ATC CGT ACT GAG GAG TCA ATT TAT CAA TGT TGT GAC TTG GAC 95 

Asn Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Asp

20 25 30 

CCC GAG GCC AGA CAG GCC ATA AGG TCG CTC ACA GAG CGG CTT TAT ATC 143 
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile

35 40 45 

GGG GGC CCC TTG ACC AAT TCA AAA GGG CAA AAC TGC GGC TAT CGC CGG 191 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg

50 55 60 

TGC CGC GCC AGC GGC GTG CTG ACG ACT AGC TGC GGT AAT ACC CTC ACA 239
Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 

65 70 75 

TGT TAC TTG AAG GCC TCT GCA GCC TGT CGA GCT GCG AAG CTC CAG GAC 287
Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp
80 85 90 95 

TGC ACG ATG CTC GTG TGC GGA GAC GAC CTT GTC GTT ATC TGT GAA AGC 335 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser

100 105 110 

GCG GGA ACC CAG GAG GAC GCG GCA AAC CTA CGA GTC TTC ACG GAG GCT 383 
Ala Gly Thr Gln Glu Asp Ala Ala Asn Leu Arg Val Phe Thr Glu Ala

115 120 125 

ATG ACC AGG AAT TCC GCC 401 
Met Thr Arg Asn Ser Ala 
130 

SEQ ID NO: 93
SEQUENCE LENGTH: 401 base pairs
SEQUENCE TYPE: nucleic acid
STRANDEDNESS : double
TOPOLOGY: linear
ANTI-SENSE: No
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus
IMMEDIATE EXPERIMENTAL SOURCE
CLONE: N18-2

TG GGG ATC CCG TAT GAT ACC CGC TGC TTT GAC TCA ACA GTC ACT GAG 47 
Gly Ile Pro Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1 5 10 15

AAC GAC ATC CGT ATT GAG GAG TCA ATT TAT CAA TGC TGT GAC TTG GTC 95 
Asn Asp Ile Arg Ile Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Val

20 25 30 

CCC GAG GCC AGA CAG GCC ATA AGG TCG CTC ACA GAG CGG CTT TAT ATC 143
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile

35 40 45 

GGG GGC CCC TTG ACC AAT TCA AAA GGG CAA AAC TGC GGC TAT CGC CGG 191 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg

50 55 60 

TGC CGC GCC AGC GGC GTG CTG ACG ACT AGC TGC GGT AAT ACC CTC ACA 239 
Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 

65 70 75 

TGT TAC TTG AAG GCC TCT GCA GCC TGT CGA GCT GCG AAG CTC CGG GAC 287 
Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala Ala Lys Leu Arg Asp 
80 85 90 95 

TGC ACG ATG CTC GTG TGC GGA GAC GAC CTT GTC GTT ATC TGT GAA AGC 335
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser

100 105 110 

GCG GGG ACC CAG GAG GAC GCG GCA AGC CTA CGA GTC TTC ACG GAG GCT 383 
Ala Gly Thr Gln Glu Asp Ala Ala Ser Leu Arg Val Phe Thr Glu Ala

115 120 125 

ATG ACC AGG AAT TCC GCC 401 
Met Thr Arg Asn Ser Ala 
130 

SEQ ID NO: 94

SEQUENCE LENGTH: 401 base pairs 

SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: N18-3 

TG GGG ATC CCG TAT GAT ACC CGC TGC TTT GAC TCA ACG GTC ACT GAG 47
Gly Ile Pro Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1 5 10 15 

AAT GAC ATC CGT ACT GAG GAG TCA ATT TAT CAA TGT TGT GAC TTG GAC 95 
Asn Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Asp

20 25 30 

CCC GAG GCC AGA CAG GCC ATA AGG TCG CTC ACA GAG CGG CTT TAT ATC 143 
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile

35 40 45 

GGG GGC CCC TTG ACC AAT TCA AAA GGG CAG AAC TGC GGT TAT CGC CGG 191 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg

50 55 60 

TGC CGC GCC AGC GGC GTG CTG ACG ACT AGC TGC GGT AAT ACC CTT ACA 239 
Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 

65 70 75 

TGT TAC TTG AAG GCC TCT GCA GCC TGT CGA GCT GCG AAG CTC CAG GAC 287 
Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp
80 85 90 95 

TGC ACG ATG CTC GTG TGC GGA GAC GAC CTT GTC GTT ATC TGT GAA AGC 335 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser

100 105 110 

GCG GGA ACC CAG GAG GAC GCG GCA AAC CTA CGA GTC TTC ACG GAG GCT 383 
Ala Gly Thr Gln Glu Asp Ala Ala Asn Leu Arg Val Phe Thr Glu Ala

115 120 125 

ATG ACC AGG AAT TCC GCC 401 
Met Thr Arg Asn Ser Ala 
130

SEQ ID NO: 95 
SEQUENCE LENGTH: 401 base pairs

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS : double 

TOPOLOGY: linear 
ANTI-SENSE: No

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: H18-1

TG GGG ATC CCG TAT GAT ACC CGC TGC TTT GAC TCA ACA GTC ACT GAG 47 
Gly Ile Pro Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1 5 10 15 

AGT GAT ATC CGT GTT GAG GAG TCA ATC TAC CAA TGT TGT GAC TTG GCC 95 
Ser Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Ala

20 25 30 

CCC GAG GCC AGA CAG GCC ATA AGG TCG CTC ACA GAG CGG CTT TAT ATC 143 
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile

35 40 45 

GGG GGC CCC CTG ACT AAT TCA AAA GGG CAG AAC TGC GGT TAT CGC CGG 191 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg

50 55 60 

TGC CGC GTC AGC GGC GTG CTG ACG ACC AGC TGC GGT AAT ACT CTT ACA 239 
Cys Arg Val Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 

65 70 75

TGT TAC TTG AAG GCC TCT GCA GCC TGT CGA GCT GCA AAG CTC CAG GAC 287 
Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp
80 85 90 95 

TGC ACA ATG CTC GTG TGC GGG GAC GAC CTT GTC GTC ATC TGT GAG AGC 335 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser

100 105 110 

GCG GGA ACC CAG GAG GAC GCG GCG AAC CTA CGA GTC TTC ACG GAG GCT 383 
Ala Gly Thr Gln Glu Asp Ala Ala Asn Leu Arg Val Phe Thr Glu Ala
115 120 125

ATG ACC AGG AAT TCC GCC 401
Met Thr Arg Asn Ser Ala 
130 

SEQ ID NO: 96 

SEQUENCE LENGTH: 401 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: H18-2 

TG GGG ATC CCG TAT GAT ACC CGC TGC TTT GAC TCA ACA GTC ACT GAG 47 
Gly Ile Pro Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1 5 10 15

AGT GAT ATC CGT GTT GAG GAG TCA ATC TAC CAA TGT TGT GAC TTG GCC 95 
Ser Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Ala

20 25 30 

CCC GAG GCC AGA CAG GCC ATA AGG TCG CTC ACA GAG CGG CTT TAT ATC 143 
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile

35 40 45 

GGG GGC CCC CTG ACT AAT TCA AAG GGG CAG AAC TGC GGT TAT CGC CGG 191 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg

50 55 60 

TGC CGC GTC AGC GGC GTG CTG ACG ACC AGC TGC GGT AAT ACC CTT ACA 239 
Cys Arg Val Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 

65 70 75 

TGT TAC TTG AAG GCC TCT GCA GCC TGT CGA GCT GCA AAG CTC CAG GAC 287 
Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp
80 85 90 95

TGC ACA ATG CTC GTG TGC GGG GAC GAC CTT GTC GTC ATC TGT GAA AGC 335 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser

100 105 110 

GCG GGA ACC CAG GAG GAC GCG GCG AAC CTA CGA GTC TTC ACG GAG GCT 383
Ala Gly Thr Gln Glu Asp Ala Ala Asn Leu Arg Val Phe Thr Glu Ala

115 120 125 

ATG ACC AGG AAT TCC GCC 401 
Met Thr Arg Asn Ser Ala

130 

SEQ ID NO: 97 
SEQUENCE LENGTH: 401 base pairs
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: H18-3

TG GGG ATC CCG TAT GAT ACC CGC TGC TTT GAC TCA ACA GTC ACC GAG 47 
Gly Ile Pro Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1 5 10 15 

AGT GAT ATC CGT GTT GAG GAG TCA ATC TAC CAA TGT TGT GAC TTG GCC 95 
Ser Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Ala

20 25 30 

CCC GAG GCC AGA CAG GCT ATA AGG TCG CTC ACA GAG CGG CTT TAT ATC 143 
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile

35 40 45 

GGG GGC CCC CTG ACT AAT TCA AAA GGG CAG AAC TGC GGT TAT CGC CGG 191 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg

50 55 60 

TGC CGC GTC AGC GGC GTG CTG ACG ACC AGC TGC GGT AAT ACC CTT ACA 239 
Cys Arg Val Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 
65 70 75 

TGT TAC TTG AAG GCC TCT GCA GCC TGT CGA GCT GCA AAG CTC CAG GAC 287
Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp
80 85 90 95 

TGC ACA ATG CTC GTG TGC GGG GAC GAC CTT GTC GTC ATC TGT GAA AGC 335 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser
100 105 110

GCG GGA ACC CAG GAG GAC GCG GCG AAC CTA CGA GTC TTC ACG GAG GCT 383 
Ala Gly Thr Gln Glu Asp Ala Ala Asn Leu Arg Val Phe Thr Glu Ala

115 120 125 

ATG ACC AGG AAT TCC GCC 401 
Met Thr Arg Asn Ser Ala 
130 

SEQ ID NO: 98

SEQUENCE LENGTH: 1171 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: O30-3 

TG GGG ATC CCG TAT GAT ACC CGC TGC TTT GAC TCA ACG GTC ACT GAG 47 
Gly Ile Pro Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1 5 10 15 

AAT GAC ATC CGT GTC GAG GAG TCA ATT TAC CAA TGT TGT GAC TTG GCC 95 
Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Ala

20 25 30 

CCC GAG GCC AGA CAG GCC ATA AGG TCA CTC ACA GAG CGG CTT TAC ATC 143 
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile

35 40 45 

GGG GGC CCC CTG ACT AAT TCA AAG GGG CAG AAC TGC GGT TAT CGC CGG 191 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg

50 55 60 

TGC CGC GTC AGC GGC GTG CTG ACG ACT AGC TGC GGT AAT ACC CTC ACA 239 
Cys Arg Val Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 

65 70 75 

TGT TAC TTG AAG GCC TCT GCA GCC TGT CGA GCT GCA AAG CTC CAG GAC 287 
Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp
80 85 90 95

TGC ACG ATG CTT GTG TGC GGA GAC GAC CTT GTC GTT ATC TGT GAT AGC 335
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Asp Ser

100 105 110

GCG GGA ACT CAG GAG GAC GCG GCG AGC CTA CGA GTC TTC ACG GAG GCT 383 
Ala Gly Thr Gln Glu Asp Ala Ala Ser Leu Arg Val Phe Thr Glu Ala

115 120 125 

ATG ACT AGG TAC TCT GCC CCC CCC GGG GAC CCG CCC CAA CCA GAA TAC 431
Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr

130 135 140 

GAC TTG GAG CTG ATA ACA TCA TGT TCC TCC AAT GTG TCG GTC GCG CAC 479 
Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His

145 150 155 

GAC GCA TCA GGC AAA CGG GTG TAC TAT CTC ACC CGT GAC CCC ACC ACC 527 
Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr 
160 165 170 175 

CCC CTA GCG CGG GCT GCG TGG GAG ACA GCT AGA CAC ACT CCA GTC AAC 575 
Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn 

180 185 190 

TCC TGG CTA GGC AAC ATC ATC ATG TAC GCG CCC ACC TTA TGG GCA AGG 623 
Ser Trp Leu Gly Asn Ile Ile Met Tyr Ala Pro Thr Leu Trp Ala Arg

195 200 205 

ATG ATT CTG ATG ACC CAC TTC TTC TCC ATC CTT CTA GCC CAG GAG CAA 671 
Met Ile Leu Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Glu Gln

210 215 220

CTT GAA AAA GCC CTA GAT TGT CAG ATC TAC GGG GCC ACT TAC TCC ATT 719 
Leu Glu Lys Ala Leu Asp Cys Gln Ile Tyr Gly Ala Thr Tyr Ser Ile

225 230 235 

GAG CCA CTT GAC CTA CCT CAG ATC ATT CAA CGA CTC CAC GGT CTT AGC 767 
Glu Pro Leu Asp Leu Pro Gln Ile Ile Gln Arg Leu His Gly Leu Ser
240 245 250 255 

GCA TTT TCA CTC CAT AGT TAC TCT CCA GGT GAG ATC AAT AGG GTG GCT 815 
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala

260 265 270 

TCA TGC CTC AGG AAA CTT GGG GTA CCG CCC TTG CGA GTC TGG AGA CAT 863 
Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His 

275 280 285 

CGG GCC AGA AGC GTC CGC GCT AAG CTA CTG TCC CAG GGG GGG AGG GCC 911 
Arg Ala Arg Ser Val Arg Ala Lys Leu Leu Ser Gln Gly Gly Arg Ala

290 295 300 

GCC ACC TGT GGC AAA TAC CTC TTC AAC TGG GCA GTA AAG ACC AAG CTC 959 
Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Lys Thr Lys Leu 

305 310 315 

AAA CTC ACT CCA ATC CCA GAA GCG TCC CAG CTG GAC TTG TCC GGC TGG 1007 
Lys Leu Thr Pro Ile Pro Glu Ala Ser Gln Leu Asp Leu Ser Gly Trp
320 325 330 335 

TTC GTT GCT GGT TAC AGC GGG GGA GAC ATA TAT CAC AGC CTG TCT CGT 1055 
Phe Val Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Leu Ser Arg

340 345 350 

GCC CGA CCC CGC TGG TTC ATG TGG TGC CTA CTC CTA CTT TCC GTA GGG 1103 
Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val Gly 

355 360 365 

GTA GGC ATC TAC CTG CTC CCC AAC CGA TGA GCG GGG AGC TAA ACA CTC 1151 
Val Gly Ile Tyr Leu Leu Pro Asn Arg Stop Ala Gly Ser Stop Thr Leu

370 375 380 

CAG GCC AAT AGG CCA TCC C C 1171 
Gln Ala Asn Arg Pro Ser
385 

SEQ ID NO: 99 

SEQUENCE LENGTH: 1170 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS : double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: O30-2 

G GGG ATC CCG TAT GAT ACC CGC TGC TTT GAC TCA ACG GTC ACT GAG 46
Gly Ile Pro Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1 5 10 15 

AAT GAC ATC CGT GTT GAG GAG TCA ATT TAC CAA TGT TGT GAC TTG GCC 94 
Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Ala
20 25 30

CCC GAG GCC AGA CAG GCC ATA AGG TCG CTC ACA GAG CGG CTT TAC ATC 142 
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile

35 40 45 

GGG GGC CCC CTG ACT AAT TCA AAA GGG CAG AAC TGC GGC TAT CGC CGG 190 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg
50 55 60 

TGC CGC GTC AGC GGC GTG CTG ACG ACT AGC TGC GGC AAT ACC CTC ACA 238 
Cys Arg Val Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 

65 70 75 

TGT TAC TTG AAG GCC TCT GCA GCC TGT CGA GCT GCA AAG CTC CAG GAC 286 
Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp
80 85 90 95 

TGC ACG ATG CTT GTG TGC GGA GAC GAC CTT GTC GTT ATC TGT GAA AGC 334 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser

100 105 110 

GCG GGA ACT CAG GAG GAC GCG GCG AGC CTA CGA GTC TTC ACG GAG GCT 382 
Ala Gly Thr Gln Glu Asp Ala Ala Ser Leu Arg Val Phe Thr Glu Ala

115 120 125 

ATG ACT AGG TAC TCT GCC CCC CCC GGG GAC CCG CCC CAA CCA GAA TAC 430 
Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr

130 135 140 

GAC TTG GAG CTG ATA ACA TCA TGC TCC TCC AAC GTG TCG GTC GCG CAC 478 
Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His

145 150 155 

GAC GCA TCA GGC AAA CGG GTG TAC TAC CTC ACC CGT GAC CCC ACC ACC 526 
Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr 
160 165 170 175

CCC CTT GCG CGG GCT GCG TGG GAG ACA GCT AGA CAC ACT CCA GTC AAC 574 
Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn 

180 185 190 

TCC TGG CTA GGC AAC ATC ATC ATG TAT GCG CCC ACC TTA TGG GCA AGG 622 
Ser Trp Leu Gly Asn Ile Ile Met Tyr Ala Pro Thr Leu Trp Ala Arg

195 200 205 

ATG ATT CTG ATG ACC CAC TTC TTC TCC ATC CTT CTA GCC CAG GAG CAA 670 
Met Ile Leu Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Glu Gln
210 215 220

CTT GAA AAA GCC CTA GAT TGT CAG ATC TAT GGG GCC ACT TAC TCC ATT 718
Leu Glu Lys Ala Leu Asp Cys Gln Ile Tyr Gly Ala Thr Tyr Ser Ile

225 230 235 

GAG CCA CTT GAC CTA CCT CAG ATC ATT CAA CGA CTC CAT GGT CTT AGC 766 
Glu Pro Leu Asp Leu Pro Gln Ile Ile Gln Arg Leu His Gly Leu Ser
240 245 250 255

GCA TTT TCA CTC CAT AGT TAC TCT CCA GGT GAG ATC AAT AGG GTG GCT 814 
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala

260 265 270 

TCA TGC CTC AGG AAA CTT GGG GTA CCG CCC TTG CGA GTC TGG AGA CAT 862 
Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His 

275 280 285 

CGG GCC AGA AGC GTC CGC GCT AAG CTA CTG TCC CAG GGG GGG AGG GCC 910 
Arg Ala Arg Ser Val Arg Ala Lys Leu Leu Ser Gln Gly Gly Arg Ala

290 295 300 

GCC ACC TGT GGC AAA TAC CTC TTC AAC TGG GCA GTA AAG ACC AAG CTC 958 
Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Lys Thr Lys Leu 

305 310 315 

AAA CCC ACT CCA ATC CCG GAA GCG TCC CAG CTG GAC TTG TCC GGC TGG 1006 
Lys Pro Thr Pro Ile Pro Glu Ala Ser Gln Leu Asp Leu Ser Gly Trp
320 325 330 335

TTC GTT GCT GGT TAC AGC GGG GGA GAC ATA TAT CAC AGC CTG TCT CGT 1054 
Phe Val Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Leu Ser Arg

340 345 350 

GCC CGA CCC CGC TGG TTT ATG TGG TGC CTA CTC CTA CTT TCC GTA GGG 1102 
Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val Gly 

355 360 365 

GTA GGC ATC TAC CTG CTC CCC AAC CGA TGA GCG GGG AGC TAA ACA CTC 1150 
Val Gly Ile Tyr Leu Leu Pro Asn Arg Stop Ala Gly Ser Stop Thr Leu

370 375 380 

CAG GCC AAT AGG CCA TCC C C 1170 
Gln Ala Asn Arg Pro Ser
385 



SEQ ID NO: 100 

SEQUENCE LENGTH: 1171 base pairs 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: O30-4 



TG GGG ATC CCG TAT GAT ACC CGC TGC TTT GAC TCA ACA GTC ACT GAG 47 
Gly Ile Pro Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
1 5 10 15

AAT ATC CGT GTT GAG GAG TCA ATT TAC CAA TGT TGT GAC TTG GCC 95
Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Ala

20 25 30 

CCC GAG GCC AGA CAG GCC ATA AGG TCG CTC ACA GAG CGG CTT TAC ATC 143 
Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile

35 40 45 

GGG GGC CCC CTG ACT AAT TCA AAG GGG CAG AAC TGC GGT TAT CGC CGG 191 
Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg

50 55 60 

TGC CGC GCC AGC GGC GTG CTG ACG ACT AGC TGC GGT AAT ACC CTC ACA 239 
Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr 

65 70 75 

TGT TAC TTG AAG GCC TCT GCA GCC TGT CGA GCT GCA AAG CTC CAG GAC 287 
Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala Ala Lys Leu Gln Asp
80 85 90 95 

TGC ACG ATG CTT GTG TGC GGA GAC GAC CTT GTC GTT ATC TGT GAA AGC 335 
Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser

100 105 110

GCG GGA ACT CAG GAG GAC GCG GCG AGC CTA CGA GTC TTC ACG GAG GCT 383 
Ala Gly Thr Gln Glu Asp Ala Ala Ser Leu Arg Val Phe Thr Glu Ala

115 120 125
ATG ACT AGG TAC TCT GCC CCC CCC GGG GAC CCG CCC CAA CCA GAA TAC 431 
Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr

130 135 140 

GAC TTG GAG CTG ATA ACA TCA TGC TCC TCC AAT GTG TCG GTC GCG CAC 479 
Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His
145 150 155

GAC GCA TCA GGC AAA CGG GTG TAC TAT CTC ACC CGT GAC CCC CCC ACC 527 
Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Pro Thr 
160 165 170 175 

CCC CTT GCG CGG GCT GCG TGG GAG ACA GCT AGA CAC ACT CCA GTC AAC 575 
Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn 

180 185 190 

TCC TGG CTA GGC AAC ATC ATC ATG TAC GCG CCC ACC TTA TGG GCA AGG 623 
Ser Trp Leu Gly Asn Ile Ile Met Tyr Ala Pro Thr Leu Trp Ala Arg

195 200 205 

ATG ATT CTG ATG ACC CAC TTC TTC TCC ATC CTT CTA GCC CAG GAG CAA 671 
Met Ile Leu Met Thr His Phe Phe Ser Ile Leu Leu Ala Gln Glu Gln

210 215 220 

CTT GAA AAA GCC CTA GAT TGT CAG ATC TAC GGG GCC ACT TAC TCC ATT 719 
Leu Glu Lys Ala Leu Asp Cys Gln Ile Tyr Gly Ala Thr Tyr Ser Ile

225 230 235 

GAG CCA CTT GAC CTA CCT CAG ATC ATT CAA CGA CTC CAT GGT CTT AGC 767 
Glu Pro Leu Asp Leu Pro Gln Ile Ile Gln Arg Leu His Gly Leu Ser
240 245 250 255 

GCA TTT TCA CTC CAT AGT TAC TCT CCA GGT GAG ATC AAT AGG GTG GCT 815 
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala

260 265 270 

TCA TGC CTC AGG AAA CTT GGG GTA CCG CCC TTG CGA GTC TGG AGA CAT 863 
Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Val Trp Arg His 

275 280 285

CGG GCC AGA AGC GTC CGC GCT AAG CTA CTG TCC CAG GGG GGG AGG GCC 911 
Arg Ala Arg Ser Val Arg Ala Lys Leu Leu Ser Gln Gly Gly Arg Ala

290 295 300 

GCC ACC TGT GGC AAA TAC CTC TTC AAC TGG GCA GTA AAG ACC AAG CTC 959 
Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Lys Thr Lys Leu 

305 310 315 

AAA CTC ACT CCA ATC CCG GAA GCG TCC CAG CTG GAC TTG TCC GGC TGG 1007 
Lys Leu Thr Pro Ile Pro Glu Ala Ser Gln Leu Asp Leu Ser Gly Trp
320 325 330 335 

TTC GTT GCT GGT TAC AGC GGG GGA GAC ATA TAT CAC AGC CTG TCT CGT 1055 
Phe Val Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Leu Ser Arg

340 345 350 

GCC CGA CCC CGC TGG TTC ATG TGG TGC CTA CTC CTA CTT TCC GTA GGG 1103
Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu Leu Leu Ser Val Gly 

355 360 365 

GTA GGC ATC TAC CTG CTC CCC AAC CGA TGA GCG GGG AGC TAA ACA CTC 1151
Val Gly Ile Tyr Leu Leu Pro Asn Arg Stop Ala Gly Ser Stop Thr Leu

370 375 380 

CAG GCC AAT AGG CCA TCC C C 1171
Gln Ala Asn Arg Pro Ser

385 

SEQ ID NO: 101 
SEQUENCE LENGTH: 7911 base pairs
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
ANTI-SENSE: No 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 
IMMEDIATE EXPERIMENTAL SOURCE 
CLONE: T7N1-30 

ACTAGTTAAT ACGACTCACT ATAGGGTGCC AGCCCCCTGA TGGGGGCGAC ACTCCACCAT 60
AGATCACTCC CCTGTGAGGA ACTACTGTCT TCACGCAGAA AGCGTCTAGC CATGGCGTTA 120 
GTATGAGTGT CGTGCAGCCT CCAGGACCCC CCCTCCCGGG AGAGCCATAG TGGTCTGCGG 180 
AACCGGTGAG TACACCGGAA TTGCCAGGAC GACCGGGTCC TTTCTTGGAT CAACCCGCTC 240 
AATGCCTGGA GATTTGGGCG TGCCCCCGCG AGACTGCTAG CCGAGTAGTG TTGGGTCGCG 300 
AAAGGCCTTG TGGTACTGCC TGATAGGGTG CTTGCGAGTG CCCCGGGAGG TCTCGTAGAC 360 
CGTGCATC ATG AGC ACA AAT CCA AAA CCC CAA AGA AAA ATC AAA CGT AAC 410 
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Ile Lys Arg Asn
1 5 10
ACC AAC CGC CGC CCA CAG GAC GTT AAG TTC CCG GGC GGT GGT CAG ATC 458 
Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile
15 20 25 30 

GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG TTG GGT GTG 506 
Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val

35 40 45 

CGC GCG ACT AGG AAG ACT TCC GAG CGG CCG CAA CCT CGT GGA AGG CGA 554 
Arg Ala Thr Arg Lys Thr Ser Glu Arg Pro Gln Pro Arg Gly Arg Arg

50 55 60 

CAA CCT ATC CCC AAG GCT CGC CAA CCC GAG GGT AGG GCC TGG GCT CAG 602 
Gln Pro Ile Pro Lys Ala Arg Gln Pro Glu Gly Arg Ala Trp Ala Gln

65 70 75 

CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG GGG TGG GCA 650 
Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala 
80 85 90

GGA TGG CTC CTG TCA CCC CGC GGC TCC CGG CCT AGT TGG GGC CCC ACG 698 
Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr 
95 100 105 110

GAC CCC CGG CGT AGG TCG CGT AAT TTG GGT AAG GTC ATC GAT ACC CTC 746
Asp Pro Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu

115 120 125

ACA TGC GGC TTC GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC 794 
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala

130 135 140 

CCC CTA GGG GGC GCT GCC AGG GCT CTA GCG CAT GGC GTC CGG GTT CTG 842 
Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu 

145 150 155 

GAG GAC GGC GTG AAC TAT GCA ACA GGG AAT CTG CCT GGT TGC TCC TTT 890 
Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 

160 165 170 

TCT ATC TTC CTT TTG GCT TTG CTG TCC TGT TTG ACC ATC CCA GCT TCC 938 
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Ile Pro Ala Ser
175 180 185 190 

GCC TAC CAA GTG CGC AAC GCG TCC GGG GTG TAC CAT GTC ACG AAC GAC 986 
Ala Tyr Gln Val Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp

195 200 205 

TGC TCC AAC TCA AGT ATT GTG TAT GAG GCG GCG GAC GTG ATT ATG CAC 1034 
Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Val Ile Met His

210 215 220 

ACC CCC GGG TGC GTG CCC TGC GTC CGG GAG AAC AAT TCC TCC CGC TGC 1082 
Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser Arg Cys 

225 230 235 

TGG GTA GCG CTC ACT CCC ACG CTT GCG GCC AGG AAC AGC AGC ATC CCC 1130 
Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser Ser Ile Pro
240 245 250

ACT ACG ACA ATA CGG CGT CAT GTC GAC TTG CTC GTT GGG GCA GCT GCT 1178 
Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Ala
255 260 265 270 

CTC TGT TCC GCT ATG TAT GTG GGG GAT TTT TGC GGA TCT GTT TTC CTC 1226 
Leu Cys Ser Ala Met Tyr Val Gly Asp Phe Cys Gly Ser Val Phe Leu 

275 280 285 

GTC TCC CAG CTG TTC ACT TTC TCA CCT CGC CGG TAT GAG ACG GTG CAA 1274 
Val Ser Gln Leu Phe Thr Phe Ser Pro Arg Arg Tyr Glu Thr Val Gln

290 295 300 

GAC TGC AAT TGC TCA ATC TAT CCC GGC CAT GTA TCA GGC CAT CGC ATG 1322 
Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Val Ser Gly His Arg Met

305 310 315 

GCT TGG GAT ATG ATA ATG AAT TGG TCA CCT ACA ACA GCC CTA GTG GTA 1370 
Ala Trp Asp Met Ile Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val
320 325 330

TCG CAG CTA CTC CGG ATC CCA CAA GCC GTG GTG GAT ATG GTG GCA GGG 1418 
Ser Gln Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly
335 340 345 350 

GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC TAC TAT TCC ATG GTG GGG 1466

Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly 

355 360 365 

AAC TGG GCT AAG GTC TTG GTT GTG ATG CTG CTC TTC GCC GGT GTT GAC 1514
Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala Gly Val Asp 

370 375 380 

GGG GGG ACC CAC GTG ACA GGG GGG AAG GTA GCC TAC ACC ACC CAG GGC 1562 
Gly Gly Thr His Val Thr Gly Gly Lys Val Ala Tyr Thr Thr Gln Gly

385 390 395 

TTT ACA TCC TTC TTT TCA CGA GGG CCG TCT CAG AAA ATC CAA CTT GTA 1610 
Phe Thr Ser Phe Phe Ser Arg Gly Pro Ser Gln Lys Ile Gln Leu Val

400 405 410 

AAC ACT AAC GGC AGC TGG CAC ATC AAT AGG ACT GCC CTC AAT TGC AAT 1658 
Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn
415 420 425 430 

GAC TCC CTT AAC ACC GGG TTC CTT GCC GCG CTG TTC TAC ACC CAC AGC 1706 
Asp Ser Leu Asn Thr Gly Phe Leu Ala Ala Leu Phe Tyr Thr His Ser 

435 440 445 

TTC AAC GCG TCC GGA TGT CCG GAG CGT ATG GCC GGT TGC CGC CCC ATT 1754
Phe Asn Ala Ser Gly Cys Pro Glu Arg Met Ala Gly Cys Arg Pro Ile

450 455 460 

GAG GAG TTC GCT CAG GGG TGG GGT CCC ATC ACT CAT GTT GTG CCT AAC 1802 
Asp Glu Phe Ala Gln Gly Trp Gly Pro Ile Thr His Val Val Pro Asn

465 470 475 

ATC TCG GAC CAG AGG CCC TAT TGC TGG CAC TAC GCG CCT CGA CCG TGT 1850
Ile Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys

480 485 490 

GGT ATC GTA CCC GCG TCG CAG GTG TGT GGT CCG GTG TAT TGC TTC ACC 1898 
Gly Ile Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr
495 500 505 510 

CCA AGC CCT GTT GTG GTG GGG ACG ACC GAT CGT TTC GGC GCC CCC ACG 1946 
Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly Ala Pro Thr 

515 520 525 

TAC AAC TGG GGA AAC AAT GAG ACG GAT GTG CTA CTC CTC AAC AAC ACA 1994 
Tyr Asn Trp Gly Asn Asn Glu Thr Asp Val Leu Leu Leu Asn Asn Thr 

530 535 540 

CGG CCG CCG CAG GGC AAC TGG TTC GGT TGT ACC TGG ATG AAT GGC ACT 2042 
Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr

545 550 555 

GGG TTC ACA AAG ACG TGC GGG GGC CCC CCG TGC AAC ATC GGG GGG GTC 2090 
Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Val

560 565 570 

GGC AAC AAT ACC TTG ACT TGC CCC ACG GAC TGC TTC CGG AAG CAC CCC 2138 
Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro 
575 580 585 590 

GAG GCC ACT TAC ACA AAA TGT GGT TCG GGG CCT TGG TTG ACA CCT AGG 2186 
Glu Ala Thr Tyr Thr Lys Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg 

595 600 605 

TGC CTA GTT CAT TAC CCA TAC AGG CTC TGG CAC TAT CCC TGC ACT GTC 2234 
Cys Leu Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val 
610 615 620

AAC TTT ACC ATC TTC AAG GTT AGG ATG TAT GTG GGG GGC GTG GAA CAC 2282 
Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His
625 630 635 

AGG CTT GAA GCT GCA TGC AAT TGG ACC CGA GGA GAG CGT TGT GAC TTG 2330
Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu

640 645 650 

GAG GAC AGG GAT AGA TCA GAG CTT AGC CCG CTA TTG CTG TCC ACA ACA 2378 
Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr 
655 660 665 670 

GAG TGG CAG GTA CTG CCC TGT TCC TTC ACC ACC CTG CCG GCT CTG TCC 2426 
Glu Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser

675 680 685 

ACT GGT TTG ATT CAT CTC CAT CAG AAC ATC GTG GAC GTG CAA TAT CTG 2474 
Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu

690 695 700 

TAC GGC ATA GGG TCG GCG GTT GTC TCC TTC GCA ATC AAA TGG GAA TAT 2522 
Tyr Gly Ile Gly Ser Ala Val Val Ser Phe Ala Ile Lys Trp Glu Tyr

705 710 715 

ATT CTG TTG CTT TTC CTC CCC CTG GCG GAC GCG CGC GTC TGT GCC TGG 2570 
Ile Leu Leu Leu Phe Leu Pro Leu Ala Asp Ala Arg Val Cys Ala Trp

720 725 730 

TTG TGG ATG ATG CTG CTG ATA GCC CAA GCT GAG GCC GCC TTG GAG AAC 2618 
Leu Trp Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala Leu Glu Asn
735 740 745 750 

CTG GTG GTC CTC AAT GCA GCA TCC ATG GCG GGA GCG CAT GGC ATC CTC 2666 
Leu Val Val Leu Asn Ala Ala Ser Met Ala Gly Ala His Gly Ile Leu

755 760 765 

TCT TTC CTT GTG TTC TTC TGT GCC GCC TGG TAC ATC AAA GGC AGG CTG 2714 
Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys Gly Arg Leu

770 775 780 

GTC CCT GGG GCG GCA TAC GCT TTC TAT GGC GTA TGG CCG CTG CTC CTG 2762 
Val Pro Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro Leu Leu Leu 

785 790 795 

CTC TTG ATG GCG CTA CCC GCA CGG GCG TAC GCC ATG GAC CGG GAG ATG 2810 
Leu Leu Met Ala Leu Pro Ala Arg Ala Tyr Ala Met Asp Arg Glu Met 

800 805 810 

GCT GCA TCG TGC GGA GGC GCG GTT TTT GTA GGT CTG GTA CTC TTG ACC 2858 
Ala Ala Ser Cys Gly Gly Ala Val Phe Val Gly Leu Val Leu Leu Thr 
815 820 825 830 

TTG TCA CCA TAC TAC AAA GTG TTC CTC GCT AAG CTC ATA TGG TGG TTG 2906 
Leu Ser Pro Tyr Tyr Lys Val Phe Leu Ala Lys Leu Ile Trp Trp Leu
835 840 845

CAA TAT CTC ATC ACC AGG GCC GAG GCG CAC TTG CAA GTG TGG ATC CCC 2954 
Gln Tyr Leu Ile Thr Arg Ala Glu Ala His Leu Gln Val Trp Ile Pro

850 855 860 

CCC CTC AAC GTT CGG GGG GGC CGC GAT GCC ATC ATC CTT CTC ACA TGT 3002 
Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Ile Ile Leu Leu Thr Cys

865 870 875 

GCG GTC CAC CCG GAG CTG ATC TTT GAC ATC ACC AAG CTC TTG CTC GCC 3050 
Ala Val His Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu Leu Leu Ala

880 885 890 

ATA CTC GGT CCG CTC ATG GTA CTC CAG GCT GGC CTA ACC CAA ATG CCG 3098 
Ile Leu Gly Pro Leu Met Val Leu Gln Ala Gly Leu Thr Gln Met Pro

895 900 905 910 

TAC TTT GTG CGT GCT CAA GGG CTC ATT CGT ATG TGC ATG TTG GTG CGG 3146 
Tyr Phe Val Arg Ala Gln Gly Leu Ile Arg Met Cys Met Leu Val Arg
915 920 925

AAA GCC GCT GGG GGT CAT TAT GTC CAG ATG GCT CTC ATG AAG CTG GCT 3194 
Lys Ala Ala Gly Gly His Tyr Val Gln Met Ala Leu Met Lys Leu Ala

930 935 940 

GCA CTG ACA GGT ACG TAC GTT TAT GAC CAT CTT ACT CCA CTG CAG GAC 3242
Ala Leu Thr Gly Thr Tyr Val Tyr Asp His Leu Thr Pro Leu Gln Asp

945 950 955 

TGG GCC CAC GCG GGC CTA CGA GAC CTT GCG GTA GCA GTT GAG CCC GTT 3290 
Trp Ala His Ala Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val 

960 965 970 

GTC TTC TCT GAT ATG GAG ACT AAG ATC ATC ACG TGG GGG GCA GAG ACG 3338 
Val Phe Ser Asp Met Glu Thr Lys Ile Ile Thr Trp Gly Ala Glu Thr
975 980 985 990 

GCG GCG TGT GGG GAC ATC ATC TCG AGT CTA CCC GTT TCC GCC CGA AGG 3386 
Ala Ala Cys Gly Asp Ile Ile Ser Ser Leu Pro Val Ser Ala Arg Arg

995 1000 1005 

GGG AGG GAG CTG CTT TTG GGA CCG GCC GAT AGT TTT GAC GGG CAG GGG 3434 
Gly Arg Glu Leu Leu Leu Gly Pro Ala Asp Ser Phe Asp Gly Gln Gly

1010 1015 1020 

TGG CGA CTC CTT GCG CCT ATC ACG GCC TAC TCC CAG CAG ACG CGG GGC 3482 
Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly
1025 1030 1035

CTG CTT GGT TGC ATC ATC ACC AGC CTT ACG GGC CGG GAT AAG AAC CAG 3530
Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln

1040 1045 1050 

GTC GAG GGG GAG GTT CAA GTG GTC TCT ACC GCA ACA CAA TCT TTC CTG 3578 
Val Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu
1055 1060 1065 1070 

GCG ACC TGC ATC AAC GGC GTT TGC TGG ACT GTT TTC CAC GGC GCC GGC 3626 
Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Phe His Gly Ala Gly

1075 1080 1085 

TCG AAG ACC TTA GCC GGC CCA AAA GGC CCA ATC ACC CAA ATG TAC ACC 3674
Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr

1090 1095 1100 

AAT GTA GAT CAG GAC CTC GTC GGC TGG TCG GCG CCC CCC GGG GCG CGT 3722 
Asn Val Asp Gln Asp Leu Val Gly Trp Ser Ala Pro Pro Gly Ala Arg

1105 1110 1115

TCC TTG ACA CCT TGC ACC TGC GGC AGC TCG GAC CTT TAT TTG GTC ACG 3770 
Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr 

1120 1125 1130

AGG CAT GCT GAT GTC ATT CCG GTG CAC CGG CGG GGC GAC AGC AGG GGG 3818 
Arg His Ala Asp Val Ile Pro Val His Arg Arg Gly Asp Ser Arg Gly
1135 1140 1145 1150

AGC CTC CTC TCC CCC GGG CCC ATC TCT TAC TTG AAG GGT TCC TCG GGT 3866 
Ser Leu Leu Ser Pro Gly Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly

1155 1160 1165

GGT CCG CTG CCT TGC CCC TCG GGC CGT GTT GTG GGC ATC TTC CGG GCT 3914 
Gly Pro Leu Pro Cys Pro Ser Gly Arg Val Val Gly Ile Phe Arg Ala

1170 1175 1180

GCC GTG TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTT 3962 
Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val 

1185 1190 1195

GAG TCT ATG GAA ACC ACC ATG CGG TCT CCG GTC TTC ACG GAT AAC TCA 4010 
Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser 
1200 1205 1210

ACC CCC CCG GCC GTA CCG CAG ACA TTC CAA GTG GCC CAC CTA CAC GCT 4058 
Thr Pro Pro Ala Val Pro Gln Thr Phe Gln Val Ala His Leu His Ala
1215 1220 1225 1230 

CCC ACT GGC AGC GGC AAA AGC ACC AGG GTG CCG GCT GCG TAT GCG GCC 4106
Pro Thr Gly Ser Gly Lys Ser Thr Arg Val Pro Ala Ala Tyr Ala Ala

1235 1240 1245 

CAA GGG TAC AAG GTA CTC GTC CTG AAC CCG TCC GTT GCT GCC ACT TTG 4154 
Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu

1250 1255 1260 

GGC TTT GGG GCG TAC ATG TCC AAG GCA CAT GGT GTT GAC CCT AAC ATC 4202 
Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile
1265 1270 1275

AGA ACT GGG GTG AGG ACC ATC ACC ACG GGC GCT CCC ATC ACG TAC TCC 4250 
Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Ile Thr Tyr Ser

1280 1285 1290 

ACC TAC GGT AAG TTC CTC GCC GAC GGT GGC TGT TCT GGG GGT GCC TAT 4298 
Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr 
1295 1300 1305 1310 

GAC ATC ATA ATA TGT GAT GAG TGT CAT TCA ACT GAC TCG ACT TCC ATC 4346 
Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile

1315 1320 1325 

TTG GGC ATT GGT ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGC 4394 
Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg

1330 1335 1340 

CTT GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCG 4442 
Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro 

1345 1350 1355 

CAT CCT AAT ATT GAG GAG GTG GCC TTG TCC AAC ACT GGA GAG ATC CCC 4490 
His Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro

1360 1365 1370 

TTC TAT GGC AAG GCC ATC CCC CTC GAG GCC ATC AAG GGG GGG AGG CAT 4538 
Phe Tyr Gly Lys Ala Ile Pro Leu Glu Ala Ile Lys Gly Gly Arg His
1375 1380 1385 1390 

CTC ATT TTC TGC CAT TCC AAG AAG AAA TGT GAC GAG CTC GCT GCG AAG 4586 
Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys

1395 1400 1405 

CTG TCG GCC CTC GGA GTC AAC GCT GTA GCA TAT TAC CGG GGT CTT GAT 4634
Leu Ser Ala Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp

1410 1415 1420 

GTG TCC ATC ATA CCG ACA AGC GGG GAC GTC GTT GTC GTG GCA ACA GAC 4682 
Val Ser Ile Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp
1425 1430 1435

GCT CTA ATG ACA GGC TAT ACC GGT GAC TTT GAC TCG GTG ATC GAC TGC 4730 
Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys

1440 1445 1450 

AAC ACA TGT GTC ACC CAA ACA GTC GAT TTC AGC TTG GAC CCT ACT TTC 4778 
Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe
1455 1460 1465 1470

ACC ATC GAG ACG ACG ACC GTA CCC CAA GAT GCG GTG TCG CGC TCG CAG 4826 
Thr Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln

1475 1480 1485 

CGG CGA GGC AGG ACT GGT AGG GGC AGG GGG GGC ATA TAC AGG TTT GTA 4874 
Arg Arg Gly Arg Thr Gly Arg Gly Arg Gly Gly Ile Tyr Arg Phe Val

1490 1495 1500 

ACT CCA GGG GAA CGG CCC TCA GGC ATG TTC GAT TCT TCG GTC CTG TGT 4922 
Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys 
1505 1510 1515

GAA TGT TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACG CCC GCC GAG 4970 
Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu 
1520 1525 1530 

ACC TCG GTT AGG TTG CGG GCT TAC CTA AAT ACA CCT GGG CTG CCC GTC 5018
Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val
1535 1540 1545 1550 

TGC CAG GAC CAT CTG GAG TTC TGG GAG AGC GTC TTC ACC GGC CTC ACC 5066 
Cys Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr

1555 1560 1565 

CAC ATA GAT GCC CAC TTC TTG TCC CAG ACC AAA CAG GCA GGA GAC AAC 5114 
His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn

1570 1575 1580 

TTC CCC TAC CTG GTA GCA TAC CAG GCT ACA GTG TGC GCC AGG GCC AAG 5162 
Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Lys

1585 1590 1595 

GCT CCA CCT CCA TCG TGG GAT CAG ATG TGG AAG TGT CTC ATA CGG CTG 5210 
Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu

1600 1605 1610 

AAG CCT ACG CTA CAC GGG CCA ACG CCC CTG TTG TAT AGG TTA GGA GCC 5258 
Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala 
1615 1620 1625 1630

GTT CAG AAC GAG GTT ACC CTC ACA CAC CCC ATA ACC AAG TTC ATC ATG 5306
Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Phe Ile Met

1635 1640 1645 

GCA TGC ATG TCG GCT GAC CTA GAG GTC GTC ACT AGC ACT TGG GTG CTG 5354 
Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu 

1650 1655 1660 

GTA GGC GGG GTC CTC GCG GCT CTG GCC GCG TAC TGC CTG ACA ACG GGC 5402 
Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly 

1665 1670 1675 

AGC GTG GTC ATT GTG GGC AGG ATC ATC TTG TCC GGG AGG CCG GCC GTT 5450 
Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Val

1680 1685 1690 

ATT CCC GAC AGG GAA GTT CTC TAC CAA GAG TTC GAT GAA ATG GAA GAG 5498 
Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu
1695 1700 1705 1710 

TGC GCC TCG CAC CTC CCT TAC ATC GAA CAA GGA ATG CAG CTC GCC GAG 5546 
Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu

1715 1720 1725 

CAA TTC AAG CAG AAG GCG CTC GGT TTG CTG CAA ACA GCC ACC AAG CAA 5594 
Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln

1730 1735 1740 

GCG GAG GCT GCT GCT CCC GTG GTG GAG TCC AAG TGG CGA GCC CTT GAG 5642 
Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu 

1745 1750 1755 

ACC TTC TGG GCG AAG CAC ATG TGG AAT TTT ATC AGC GGG ATA CAG TAC 5690 
Thr Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr

1760 1765 1770 

TTA GCA GGC TTG TCC ACT CTG CCT GGA AAC CCC GCA ATA GCA TCA CTG 5738 
Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu
1775 1780 1785 1790 

ATG GCA TTC ACA GCC TCT ATC ACC AGC CCG CTC ACC ACC CAA TAT ACC 5786 
Met Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Tyr Thr
1795 1800 1805

CTC CTG TTT AAC ATC TTG GGG GGA TGG GTG GCC GCC CAA CTC GCC CCC 5834 
Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro

1810 1815 1820 

CCC AGT GCC GCT TCA GCC TTC GTG GGC GCC GGT ATA GCT GGC GCG GCT 5882



Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala

1825 1830 1835 

GTT GGC AGC ATA GGC CTC GGG AAG GTG CTT GTG GAC ATT CTG GCG GGT 5930 
Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly

1840 1845 1850 

TAT GGA GCA GGG GTG GCA GGC GCG CTC GTG GCC TTT AAG GTC ATG AGC 5978 
Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser 
1855 1860 1865 1870

GGT GAC ATG CCC TCC ACC GAG GAC CTG GTC AAC TTA CTC CCC GCC ATC 6026 
Gly Asp Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile

1875 1880 1885 

CTC TCT CCT GGT GCC CTG GTC GTC GGG GTC GTG TGC GCA GCA ATA CTG 6074
Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu

1890 1895 1900 

CGT CGG CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG 6122 
Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg

1905 1910 1915 

CTG ATA GCG TTT GCT TCG CGG GGC AAC CAT GTC TCC CCC ACG CAC TAT 6170 
Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr

1920 1925 1930 

GTG CCT GAA AGC GAC GCC GCA GCG CGC GTC ACC CAG ATC CTC TCC AAC 6218 
Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Asn
1935 1940 1945 1950 

CTT ACC ATC ACT CAG CTG TTG AAG AGG CTT CAC CAG TGG ATT AAT GAG 6266 
Leu Thr Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu

1955 1960 1965 

GAC TGC TCC ACG CCA TGC TCC GGC TCG TGG CTC AGG GAT GTT TGG GAC 6314 
Asp Cys Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp 

1970 1975 1980 

TGG ATA TGC ACG GTA TTG GCT GAT TTC AAG ACC TGG CTC CAG TCC AAG 6362 
Trp Ile Cys Thr Val Leu Ala Asp Phe Lys Thr Trp Leu Gln Ser Lys

1985 1990 1995 

CTC CTG CCG CGG TTA CCG GGG GTC CCT TTT TTC TCA TGC CAG CGT GGG 6410 
Leu Leu Pro Arg Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly

2000 2005 2010 

TAC AAG GGG GTT TGG CGG GGA GAT GGC ATC ATG TAT ACC ACC TGC CCA 6458 
Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile Met Tyr Thr Thr Cys Pro
2015 2020 2025 2030

TGT GGA GCA CAA ATC ACC GGA CAT GTC AAA AAC GGT TCT ATG AGG ATC 6506 
Cys Gly Ala Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile

2035 2040 2045 

GTT GGG CCT AGA ACC TGT AGC AAC ACG TGG CAC GGA ACA TTT CCC ATC 6554 
Val Gly Pro Arg Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile

2050 2055 2060 

AAC GCG TAC ACC ACA GGC CCC TGC ACA CCC TCC CCG GCG CCA AAC TAT 6602 
Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr 

2065 2070 2075 

TCC AGG GCG TTG TGG CGG GTG GCC GCT GAG GAG TAT GTG GAG GTC ACG 6650 
Ser Arg Ala Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr 

2080 2085 2090 

CGG GTG GGG GAT TTC CAC TAC GTG ACG GGC ATG ACC ACT GAC AAC GTG 6698 
Arg Val Gly Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val 
2095 2100 2105 2110 

AAA TGC CCA TGC CAG GTT CCG GCC CCC GAA TTC TTC ACA GAA TTG GAT 6746 
Lys Cys Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Leu Asp

2115 2120 2125 

GGG GTG CGG CTG CAC AGG TAC GCT CCG GCG TGC AAA CCT CTC CTG CGG 6794 
Gly Val Arg Leu His Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg 

2130 2135 2140 

GAT GAG GTC ACA TTC CAG GTC GGG CTC AAC CAA TAT ACG GTT GGG TCA 6842 
Asp Glu Val Thr Phe Gln Val Gly Leu Asn Gln Tyr Thr Val Gly Ser

2145 2150 2155 

CAG CTC CCA TGT GAG CCC GAA CCG GAT GTA ACA GTG GTC ACC TCC ATG 6890 
Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Thr Val Val Thr Ser Met
2160 2165 2170

CTC ACC GAC CCC TCC CAC ATT ACA GCA GAG GCG GCT AGG CGT AGG CTG 6938 
Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Arg Arg Arg Leu
2175 2180 2185 2190 

GCC AGA GGG TCT CCC CCT TCC TTG GCC AGT TCT TCA GCT AGT CAG TTG 6986
Ala Arg Gly Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu

2195 2200 2205 

TCT GCG CTT TCT TTG TAG GCG ACA TGC ACT ACC CAT CAT GGC GCC CCA 7034 
Ser Ala Leu Ser Leu Stop Ala Thr Cys Thr Thr His His Gly Ala Pro

2210 2215 2220

GAC ACT GAC CTC ATC GAG GCC AAC CTC CTG TGG CGG CAG GAG ATG GGC 7082
Asp Thr Asp Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly

2225 2230 2235 

GGA AAC ATC ACC CGC GTG GAG TCA GAG AAC AAG ATA GTA ATT CTA GAC 7130 
Gly Asn Ile Thr Arg Val Glu Ser Glu Asn Lys Ile Val Ile Leu Asp

2240 2245 2250 

TCT TTT GAA CCG CTT CGA GCG GAG GAG GAT GAG AGG GAA GTG TCC GTT 7178 
Ser Phe Glu Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val 
2255 2260 2265 2270 

GCG GCG GAG ATC CTG CGG AAG ACC AGG AAA TTC CCC GCA GCG ATG CCC 7226 
Ala Ala Glu Ile Leu Arg Lys Thr Arg Lys Phe Pro Ala Ala Met Pro

2275 2280 2285 

GTA TGG GCA CGC CCG GAC TAC AAC CCA CCA TTA CTA GAG TCT TGG AAG 7274
Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys 

2290 2295 2300 

AAC CCG GAC TAC GTC CCT CCA GTG GTA CAC GGG TGC CCA TTG CCG CCT 7322 
Asn Pro Asp Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro 

2305 2310 2315 

ACC AAG GCC CCT CCA ATA CCA CCT CCA CGG AGA AAG AGA ACG GTT GTC 7370 
Thr Lys Ala Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val

2320 2325 2330 

CTG ACA GAA TCC ACC GTG TCC TCT GCC TTG GCG GAG CTT GCT ACA AAG 7418 
Leu Thr Glu Ser Thr Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys 
2335 2340 2345 2350 

ACC TTT GGC AGT TCC GGA TCG TCG GCC GTC GAC AGC GGC ACG GCG ACC 7466 
Thr Phe Gly Ser Ser Gly Ser Ser Ala Val Asp Ser Gly Thr Ala Thr 

2355 2360 2365 

GGC CCT CCT GAC CAG GCC TCC GCC GAA GGA GAT GCA GGA TCC GAC GCT 7514 
Gly Pro Pro Asp Gln Ala Ser Ala Glu Gly Asp Ala Gly Ser Asp Ala

2370 2375 2380 

GAG TCG TAC TCC TCC ATG CCC CCC CTT GAG GGA GAG CCG GGG GAC CCC 7562 
Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro 
2385 2390 2395

GAT CTC AAC GAC GGG TCT TGG TCT ACC GTA AGC GAG GAG GCC AGC GAG 7610 
Asp Leu Asn Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu 
2400 2405 2410 

GAC GTC GTC TGC TGC TCG ATG TCC TAC ACA TGG ACA GGC GCC TTA ATT 7658
Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile
2415 2420 2425 2430 

ACA CCA TGC GCC GCG GAG GAG AGC AAG CTG CCC ATT AAT GCG CTG AGC 7706 
Thr Pro Cys Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser

2435 2440 2445 

AAC CCT TTG CTG CGC CAC CAC AAC ATG GTC TAT GCC ACA ACA TCC CGC 7754 
Asn Pro Leu Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg 
2450 2455 2460

AGC GCA AGC CAG CGG CAG AAA AAG GTC ACA TTT GAC AGA CTG CAA GTC 7802
Ser Ala Ser Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val
2465 2470 2475 

CTG GAT GAC CAC TAC CGG GAC GTG CTG AAG GAC ATG AAG GCC AAG GCG 7850
Leu Asp Asp His Tyr Arg Asp Val Leu Lys Asp Met Lys Ala Lys Ala

2480 2485 2490 

TCC ACA GTT AAG GCT AAA CTT CTA TCT GTA GAG GAA GCC TGC AAG CTG 7898 
Ser Thr Val Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu

2495 2500 2505 2510 

ACG CCC CCA CAC T 7911 



SEQ ID NO: 102 

SEQUENCE LENGTH: 1128 base pairs

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double 

TOPOLOGY: linear 

ANTI-SENSE: No 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

IMMEDIATE EXPERIMENTAL SOURCE 

CLONE: CN23 



AAGCTT ATG AGC ACA AAT CCA AAA CCC CAA AGA AAA ACC AAA CGT AAC 48 
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10 

ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG ATC 96 
Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile
15 20 25 30 

GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG TTG GGT GTG 144
Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val
35 40 45

CGC GCG ACT AGG AAG ACT TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA 192
Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg
50 55 60

CAA CCT ATC CCC AAG GCT CGC CAA CCC GAG GGC AGG GCC TGG GCT CAG 240
Gln Pro Ile Pro Lys Ala Arg Gln Pro Glu Gly Arg Ala Trp Ala Gln
65 70 75

CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG GGG TGG GCA 288
Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala
80 85 90

GGA TGG CTC CTG TCA CCC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG 336
Gly Trp Leu Leu Ser Pro Arg Gly Val Ala Lys Ala Val Asp Phe Val
95 100 105 110

CCC GTT GAG TCT ATG GAA ACC ACC ATG CGG TCT CCG GTC TTC ACG GAT 384
Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp
115 120 125

AAC TCA ACC CCC CCG GCC GTA CCG CAG ACA TTC CAA GTG GCC CAC CTA 432
Asn Ser Thr Pro Pro Ala Val Pro Gln Thr Phe Gln Val Ala His Leu
130 135 140

CAC GCT CCC ACT GGC AGC GGC AAA AGC ACC AGG GTG CCG GCT GCG TAT 480
His Ala Pro Thr Gly Ser Gly Lys Ser Thr Arg Val Pro Ala Ala Tyr
145 150 155

GCG GCC CAA GGG TAC AAG GTA CTC GTC CTG AAC CCG TCC GTT GCT GCC 528
Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala
160 165 170

ACT TTG GGC TTT GGG GCG TAC ATG TCC AAG GCA CAT GGT GTT GAC CCT 576
Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro
175 180 185 190

AAC ATC AGA ACT GGG GTG AGG ACC ATC ACC ACG GGC GCT CCC ATC ACG 624
Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Ile Thr
195 200 205

TAC TCC ACC TAC GGT AAG TTC CTC GCC GAC GGT GGC TGT TCT GGG GGT 672
Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly
210 215 220

GCC TAT GAC ATC ATA ATA TGT GAT GAG TGT CAT TCA ACT GAC TCG ACT 720
Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr
225 230 235

TCC ATC TTG GGC ATT GGT ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA 768 
Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly

240 245 250 

GCG CGC CTT GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC 816 
Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr 
255 260 265 270

GTG CCG CAT CCT AAT ATT GAG GAG GTG GCC TTG TCC AAC ACT GGA GAG 864 
Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu

275 280 285 

ATC CCC TTC TAT GGC AAG GCC ATC CCC CTC GAG GCC ATC AAG GGG GGG 912 
Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Ala Ile Lys Gly Gly

290 295 300 

AGG CAT CTC ATT TTC TGC CAT TCC AAG AAG AAA TGT GAC GAG CTC GCT 960 
Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala
305 310 315

GCG AAG CTG TCG GCC CTC GGA GTC AAC GCT GTA GCA TAT TAC CGG GGT 1008 
Ala Lys Leu Ser Ala Leu Gly Val Asn Ala Val Ala Tyr Tyr Arg Gly 
320 325 330 

CTT GAT GTG TCC ATC ATA CCG ACA AGC GGG GAC GTC GTT GTC GTG GCA 1056
Leu Asp Val Ser Ile Ile Pro Thr Ser Gly Asp Val Val Val Val Ala
335 340 345 350

ACA GAC GCT CTA ATG ACG GGC TAT ACC GGT GAC TTT GAC TCG GTG ATC 1104 
Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile

355 360 365 

GAC TGC AAC ACA TGA TAA AGATCT 1128 
Asp Cys Asn Thr Stop Stop
370

SEQ ID NO: 103 

SEQUENCE LENGTH: 974 base pairs 
SEQUENCE TYPE: nucleic acid

STRANDEDNESS: double

TOPOLOGY: linear 

ANTI-SENSE: No 
ORGANISM: Hepatitis C virus

IMMEDIATE EXPERIMENTAL SOURCE
CLONE: 015-1

GCGGATCCT CCA CCT CCA TCG TGG GAC CAA ATG TGG AAG TGT CTC ATA CGG 51
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg
1 5 10

CTG AAA CCT ACG CTA CAC GGG CCA ACA CCC CTG TTG TAT AGG TTA GGA 99
Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly

15 20 25 30 

GCC GTT CAA AAC GAG GTC ACC CTC ACA CAC CCC ATA ACC AAA TTC ATC 147 
Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Phe Ile

35 40 45 

ATG GCA TGC ATG TCG GCT GAC CTA GAG GTC GTC ACT AGC ACC TGG GTG 195 
Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val 

50 55 60 

CTG GTA GGC GGG GTC CTC GCA GCT CTG GCC GCG TAC TGC CTG ACA ACG 243 
Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr 

65 70 75 

GGC AGC GTG GTC ATC GTG GGC AGG ATC ATC TTG TCC GGG AGG CCG GCT 291 
Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala

80 85 90 

ATC ATT CCC GAC AGG GAA GTT CTC TAC CGT GAG TTC GAT GAA ATG GAG 339 
Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu
95 100 105 110 

GAG TGC GCC TCA CAC CTC CCC TAC ATC GAA CAG GGA ATG CAG CTC GCC 387
Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala

115 120 125 

GAG CAG TTC AAG CAG AAG GCG CTC GGT TTG TTG CAA ACA GCT ACC CAG 435 
Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Gln

130 135 140 

CAA GCG GAG GCT GCT GCT CCC GTG GTG GAG TCC AAA TGG CGA GCC CTA 483 
Gln Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu

145 150 155 

GAG GCC TTC TGG GCA AAG CAC ATG TGG AAT TTC ATC AGC GGG ATA CAG 531 
Glu Ala Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln
160 165 170 

TAC TTG GCA GGC TTG TCC ACT CTG CCT GGA AAC CCC GCG ATA GCA TCA 579
Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser
175 180 185 190 

CTG ATG GCA TTC ACA GCC TCT ATC ACC AGC CCT CTC ACC ACC CAA CAT 627 
Leu Met Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln His

195 200 205 

ACC CTC CTG TTT AAC ATC TTG GGG GGA TGG GTA GCC GCC CAA CTC GCC 675 
Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala

210 215 220 

CCT CCC AGC GCT GCT TCA GCT TTT GTG GGC GCC GGC ATA GCT GGC GCG 723 
Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala

225 230 235 

GCT GTT GGC AGC ATA GGC CTT GGG AAG GTG CTT GTG GAC ATC CTG GCG 771 
Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala

240 245 250 

GGT TAT GGA GCA GGG GTG GCA GGC GCA CTC GTG GCC TTT AAG GTC ATG 819 
Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met 
255 260 265 270 

AGT GGC GAG ATG CCC TCC ACC GAG GAC TTG GTC AAC CTA CTC CCT GCC 867 
Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala 

275 280 285 

ATC CTC TCT CCT GGC GCC CTG GTC GTC GGA GTC GTG TGC GCA GCA ATA 915 
Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile

290 295 300 

CTG CGT CGA CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC 963 
Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn

305 310 315 

CGG CTG C AGCC 974 
Arg Leu 
320 



SEQ ID NO: 104 

SEQUENCE LENGTH: 974 base pairs 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double

TOPOLOGY: linear 

ANTI-SENSE: No 

ORGANISM: Hepatitis C virus

IMMEDIATE EXPERIMENTAL SOURCE
CLONE: 015-2 

GCGGATCCT CCA CCT CCA TCG TGG GAC CAA ATG TGG AAG TGT CTC ATA CGG 51
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg
1 5 10

CTA AAA CCT ACG CTA CAC GGG CCA ACA CCC CTG TTG TAT AGG TTA GGA 99
Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly
15 20 25 30 

GCC GTT CAA AAC GAG GTC ACC CTC ACA CAC CCC ATA ACC AAG TTC ATC 147 
Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Phe Ile

35 40 45 

ATG GCA TGC ATG TCG GCT GAC CTA GAG GTC GTC ACT AGC ACC TGG GTG 195 
Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val 

50 55 60 

CTG GTA GGC GGG GTC CTC GCA GCT CTG GCC GCG TAC TGC CTG ACA ACG 243
Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr 

65 70 75 

GGC AGC GTG GTC ATC GTG GGC AGA ATC ATC TTG TCC GGG AGG CCG GCT 291 
Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala

80 85 90 

ATC ATT CCC GAC AGG GAG GTT CTC TAC CGG GAG TTC GAT GAA ATG GAG 339 
Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu
95 100 105 110 

GAG TGC GCC TCA CAC CTC CCC TAC ATC GAA CAG GGA ATG CAG CTC GCC 387 
Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala

115 120 125 

GAG CAA TTC AAG CAG AAG GCG CTC GGT TTG TTG CAA ACA GCT ACC AAG 435 
Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys

130 135 140 

CAA GCG GAG GCT GCT GCT CCC GTG GTG GAG TCC AAA TGG CGA GCC CTT 483 
Gln Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu

145 150 155 

GAG ACC TTC TGG GCA AAG CAC ATG TGG AAT TTC ATC AGC GGG ATA CAG 531 
Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln
160 165 170

TAC TTG GCA GGC TTG TCC ACT CTG CCT GGA AAC CCC GCG ATA GCA TCA 579
Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser
175 180 185 190 

CTG ATG GCA TTC ACA GCC TCT ATC ACC AGC CCT CTC ACC ACC CAA CAT 627 
Leu Met Ala Phe Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln His

195 200 205 

ACC CTC CTG TTT AAC ATC TTT GGG GGA TGG GTG GCC GCC CAA CTC GCC 675 
Thr Leu Leu Phe Asn Ile Phe Gly Gly Trp Val Ala Ala Gln Leu Ala

210 215 220 

CCT CCC AGC GCT GCT TCA GCT TTT GTG GGC GCC GGC ATA GCT GGC GCG 723 
Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala
225 230 235

GCT GTT GGC AGC ATA GGC CTT GGG AAG GTG CTT GTG GAC ATC CTG GCG 771 
Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala
240 245 250 

GGT TAT GGA GCA GGG GTG GCA GGC GCA CTC GTG GCC TTT AAG GTC ATG 819
Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met
255 260 265 270 

AGT GGC GAG ATG CCC TCC ACC GAG GAC TTG GTC AAC TTA CTC CCT GCC 867 
Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala

275 280 285 

ATC CTC TCT CCT GGC GCC CTG GTC GTC GGA GTC GTG TGC GCA GCA ATA 915 
Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile

290 295 300 

CTG CGT CGA CAT GTG GGC CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC 963 
Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn

305 310 315 

CGG CTG C AGCC 974 
Arg Leu 
320

SEQ ID NO: 105

SEQUENCE LENGTH: 19 base pairs 
SEQUENCE TYPE: nucleic acid 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus

CTCCACCATAGATCACTCC 19 
SEQ ID NO: 106 

SEQUENCE LENGTH: 18 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

AGGTCTAGTAGACCGTGC 18

SEQ ID NO: 107 

SEQUENCE LENGTH: 18 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

AGGAAGACTTCCGAGCGG 18

SEQ ID NO: 108 

SEQUENCE LENGTH: 19 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

CGTGAACTATGCAACAGGG 19

SEQ ID NO: 109 

SEQUENCE LENGTH: 18 base pairs 
SEQUENCE TYPE: nucleic acid
TOPOLOGY: linear
MOLECULE TYPE: cDNA 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 

ACCGCTCGGAAGTCTTCC 18 

SEQ ID NO: 110 

SEQUENCE LENGTH: 18 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GGGCAAGTTCCCTGTTGC 18 
SEQ ID NO: 111 

SEQUENCE LENGTH: 18 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCTGGATTCTCTGAGACG 18 
SEQ ID NO: 112 

SEQUENCE LENGTH: 23 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GAGGCCGTGAACTGCGATGA 23

SEQ ID NO: 113

SEQUENCE LENGTH: 23 base pairs

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

TTCTCTAAGGTGGCNTCNGCNTG 23 

N: inosine

SEQ ID NO: 114 

SEQUENCE LENGTH: 21 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

CCGGACGCGTTGAANCTNGNGT 21 

N: inosine 
SEQ ID NO: 115 

SEQUENCE LENGTH: 23 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

CATCCAGGTACAACCGAACCA 23 
SEQ ID NO: 116 

SEQUENCE LENGTH: 24 base pairs 
SEQUENCE TYPE: nucleic acid 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA
ORIGINAL SOURCE

ORGANISM: Hepatitis C virus 

AACACACGGCCGCCNCANGGNAA 24
N: inosine 

SEQ ID NO: 117 

SEQUENCE LENGTH: 19 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

CCGGATCCCACAAGCCGTNGTNGA 19
N: inosine 

SEQ ID NO: 118 

SEQUENCE LENGTH: 20 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GACATGCATGTCATGATGTA 20
SEQ ID NO: 119 

SEQUENCE LENGTH: 26 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GGCTGCAGCCGGTTCATCCACTGCAC 26
SEQ ID NO: 120

SEQUENCE LENGTH: 26 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGGATCCTGCTTCGCCCAGAAGGTC 26 
SEQ ID NO: 121 

SEQUENCE LENGTH: 22 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GACACATGTGTTGCAGTCGATC 22 
SEQ ID NO: 122 

SEQUENCE LENGTH: 24 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

CGGTCCNAGNAGTATCTCNTTNCC 24 

N: inosine 
SEQ ID NO: 123 

SEQUENCE LENGTH: 35 base pairs 
SEQUENCE TYPE: nucleic acid 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus

ATGGGCCCGGGNGANAGNAGNCTCCCCCTNCTNTC 35
N: inosine 

SEQ ID NO: 124 

SEQUENCE LENGTH: 20 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GGCTATACCGGCGACTTCGA 20 

SEQ ID NO: 125 

SEQUENCE LENGTH: 27 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGGATCCGGCCTCACCCACATAGATG 27
SEQ ID NO: 126 

SEQUENCE LENGTH: 23 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGGATCCTCCACCTCCATCGTG 23
SEQ ID NO: 127 

SEQUENCE LENGTH: 20 base pairs
SEQUENCE TYPE: nucleic acid
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 

CTGCTGTCGCCCNGNCCCAT 20
N: inosine 

SEQ ID NO: 128 

SEQUENCE LENGTH: 23 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

ATCACGTGGGGNGCAGANACNGC 23
N: inosine 

SEQ ID NO: 129 

SEQUENCE LENGTH: 21 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

TGTGCCTGNTTNTGGATGATG 21
N: inosine 

SEQ ID NO: 130 

SEQUENCE LENGTH: 21 base pairs 
SEQUENCE TYPE: nucleic acid 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus

GGTGAGCATGGAGGTGACCAC 21 
SEQ ID NO: 131 

SEQUENCE LENGTH: 21 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

TCATCCTCCTCCGCTCGAAGC 21
SEQ ID NO: 132 

SEQUENCE LENGTH: 23 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GTGGACGCCTTNGCCTTCATNTC 23
N: inosine 

SEQ ID NO: 133 

SEQUENCE LENGTH: 21 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

ACGGATGTCNTTCTCNGTNAC 21
N: inosine 

SEQ ID NO: 134

SEQUENCE LENGTH: 30 base pairs

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GGCGGAATTCCTGGTCATAGCCTCCGTGAA 30
SEQ ID NO: 135 

SEQUENCE LENGTH: 21 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GGGGNATGGCCTATTGGCCTG 21
N: inosine 

SEQ ID NO: 136 

SEQUENCE LENGTH: 21 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GGCATGTGGGCCCAGGGGAGG 21
SEQ ID NO: 137 

SEQUENCE LENGTH: 20 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus

TGTGAGCCCGAACCGGATGT 20
SEQ ID NO: 138 

SEQUENCE LENGTH: 23 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GTGGTANTCCTGGACTCNTTNGA 23
N: inosine 

SEQ ID NO: 139 

SEQUENCE LENGTH: 22 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

ACTACCGNGACGTGCTNAANGA 22
N: inosine 

SEQ ID NO: 140 

SEQUENCE LENGTH: 30 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

TGGGGATCCCGTATGATACCCGCTGCTTTG 30
SEQ ID NO: 141 

SEQUENCE LENGTH: 24 base pairs 
SEQUENCE TYPE: nucleic acid
TOPOLOGY: linear
MOLECULE TYPE: cDNA 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 

ATTGTCAGATCTACGGGGCCACTT 24
SEQ ID NO: 142 

SEQUENCE LENGTH: 43 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCAAGCTTAAAAAAAAAAAAGGGGGATGGCCTATTGGCCTGGA 43
SEQ ID NO: 143 

SEQUENCE LENGTH: 17 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GTAAAACGACGGCCAGT 17
SEQ ID NO: 144 

SEQUENCE LENGTH: 17 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

CAGGAAACAGCTATGAC 17
SEQ ID NO: 145

SEQUENCE LENGTH: 35 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCAAGCTTATGAGCACAAATCCAAAACCCCAAAGA 35
SEQ ID NO: 146 

SEQUENCE LENGTH: 38 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCAGATCTTCACCTACGCCGGGGGTCCGTGGG 38
SEQ ID NO: 147 

SEQUENCE LENGTH: 39 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCAGATCTTCAGATTCTCTGAGACGGCCCTCGT 39
SEQ ID NO: 148 

SEQUENCE LENGTH: 17 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus

GCTACTCCGGATACCAC 17
SEQ ID NO: 149 

SEQUENCE LENGTH: 35 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGTCGACGCTAGCATGAGCACAAATCCAAAACCC 35
SEQ ID NO: 150 

SEQUENCE LENGTH: 35 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGTCGACGCTAGCAGGTCTCGTAGACCGTGCATC 35
SEQ ID NO: 151 

SEQUENCE LENGTH: 40 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCGCTAGCTCAGGATTCTCTGAGACGGCCCTCGA 40
SEQ ID NO: 152 

SEQUENCE LENGTH: 35 base pairs 
SEQUENCE TYPE: nucleic acid 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA
ORIGINAL SOURCE

ORGANISM: Hepatitis C virus 

GCAAGCTTATGCGGATCCCACAAGCCGTGGTGGAT 35 
SEQ ID NO: 153 

SEQUENCE LENGTH: 24 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

CGGATCCCACAAGCCGTGGTGGAT 24 

SEQ ID NO: 154 

SEQUENCE LENGTH: 43 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCAGATCTTCATCACTCTAAGGTGGCGTCGGCGTGGG 43
SEQ ID NO: 155 

SEQUENCE LENGTH: 11 base pairs 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCAAGCTTATG 11

SEQ ID NO: 156

SEQUENCE LENGTH: 20 base pairs

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

TGATGAAGATCTGAATTCGC 20
SEQ ID NO: 157 

SEQUENCE LENGTH: 34 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCAAGCTTATGTTCAACGCGTCCGGATGTCCGGA 34
SEQ ID NO: 158 

SEQUENCE LENGTH: 23 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

TTCAACGCGTCCGGATGTCCGGA 23
SEQ ID NO: 159 

SEQUENCE LENGTH: 43 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus

GCGAATTCAGATCTTCATCAACAACCGAACCAGTTGCCCTGCG 43
SEQ ID NO: 160 

SEQUENCE LENGTH: 34 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCAAGCTTATGATCGGGGGGGTCGGCAACAATAC 34 
SEQ ID NO: 161 

SEQUENCE LENGTH: 23 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

ATCGGGGGGGTCGGCAACAATAC 23 
SEQ ID NO: 162 

SEQUENCE LENGTH: 43 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCAGATCTTCATCAAAGCTCTGATCTATCCCTGTCCT 43
SEQ ID NO: 163 

SEQUENCE LENGTH: 41 base pairs 
SEQUENCE TYPE: nucleic acid 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA
ORIGINAL SOURCE

ORGANISM: Hepatitis C virus 

GCGTCGACGCTAGCATGCGGATCCCACAAGCCGTGGTGGAT 41
SEQ ID NO: 164 

SEQUENCE LENGTH: 40 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGTCGACGCTAGCATGTTCAACGCGTCCGGATGTCCGGA 40



SEQ ID NO: 165 
SEQUENCE LENGTH: 40 base pairs

SEQUENCE TYPE: nucleic acid 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 



GCGTCGACGCTAGCATGATCGGGGGGGTCGGCAACAATAC 40



SEQ ID NO: 166
SEQUENCE LENGTH: 40 base pairs

SEQUENCE TYPE: nucleic acid 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 






GCGAATTCGCTAGCTCACTCTAAGGTGGCGTCGGCGTGGG 40
SEQ ID NO: 167 

SEQUENCE LENGTH: 40 base pairs
SEQUENCE TYPE: nucleic acid
TOPOLOGY: linear
MOLECULE TYPE: cDNA

ORIGINAL SOURCE 
ORGANISM: Hepatitis C virus 


GCGAATTCGCTAGCTCAACAACCGAACCAGTTGCCCTGCG 40

SEQ ID NO: 168
SEQUENCE LENGTH: 40 base pairs

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCGCTAGCTCAAAGCTCTGATCTATCCCTGTCCT 40 

SEQ ID NO: 169 
SEQUENCE LENGTH: 32 base pairs

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCAAGCTTATGTGGTTGTGGATGATGCTGCTG 32

SEQ ID NO: 170
SEQUENCE LENGTH: 21 base pairs

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

TGGTTGTGGATGATGCTGCTG 21
SEQ ID NO: 171

SEQUENCE LENGTH: 44 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCAGATCTTCATCACCTCCGGGCGGAGACNGGNAGNCC 44
N: inosine 

SEQ ID NO: 172 

SEQUENCE LENGTH: 31 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCAAGCTTATGGGCAACGAGNTNCTNCTNGG 31
N: inosine 

SEQ ID NO: 173 

SEQUENCE LENGTH: 20 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GGCAACGAGNTNCTNCTNGG 20
N: inosine 

SEQ ID NO: 174 

SEQUENCE LENGTH: 41 base pairs 
SEQUENCE TYPE: nucleic acid 
TOPOLOGY: linear
MOLECULE TYPE: cDNA

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCAGATCTTCATCACTTCAGCCGTATGAGACACTT 41
SEQ ID NO: 175 

SEQUENCE LENGTH: 31 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCAAGCTTATGCTGTCGCCCGGGCCCATCTC 31
SEQ ID NO: 176 

SEQUENCE LENGTH: 20 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

CTGTCGCCCGGGCCCATCTC 20
SEQ ID NO: 177 

SEQUENCE LENGTH: 41 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCAGATCTTCATCAACATGTGTTGCAGTCGATCAC 41
SEQ ID NO: 178

SEQUENCE LENGTH: 32 base pairs

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCAAGCTTATGGGCTATACCGGNGACTTNGAC 32

N: inosine 
SEQ ID NO: 179 

SEQUENCE LENGTH: 21 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GGCTATACCGGNGACTTNGAC 21 
N: inosine 

SEQ ID NO: 180 

SEQUENCE LENGTH: 35 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCAGATCTTCAGTGCTTCGCCCAGAAGGT 35
SEQ ID NO: 181 

SEQUENCE LENGTH: 29 base pairs 
SEQUENCE TYPE: nucleic acid 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
ORIGINAL SOURCE
ORGANISM: Hepatitis C virus

GCGCTAGCATGTGGTTGTGGATGATGCTG 29
SEQ ID NO: 182 

SEQUENCE LENGTH: 38 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCGCTAGCTCACAGCCGGTTCATCCACTGCAC 38 
SEQ ID NO: 183 

SEQUENCE LENGTH: 32 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCAAGCTTATGCAGCGTGGGTACAAGGGGGTT 32 
SEQ ID NO: 184 

SEQUENCE LENGTH: 47 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCAGATCTTCATCAGAGCTGTGACCCAACCGTATATTGGTT 47
SEQ ID NO: 185 

SEQUENCE LENGTH: 33 base pairs 
SEQUENCE TYPE: nucleic acid
TOPOLOGY: linear

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGCTAGCATGGGGTACAAGGGGGTTTGGCGGG 33
SEQ ID NO: 186 

SEQUENCE LENGTH: 32 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGCTAGCTCATCGGTTGGGGAGCAGGTAGAT 32
SEQ ID NO: 187 

SEQUENCE LENGTH: 26 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GGATCCCCCCAAGCTTGGGGGAATTC 26
SEQ ID NO: 188 

SEQUENCE LENGTH: 31 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

AGCTTACTAGTTAATACGACTCACTATAGGG 31
SEQ ID NO: 189

SEQUENCE LENGTH: 33 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

CTGGCACCCTATAGTGAGTCGTATTAACTAGTA 33
SEQ ID NO: 190

SEQUENCE LENGTH: 44 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

TGCCAGCCCCCTGATGGGGGCGACACTCCACCATAGATCACTCC 44 
SEQ ID NO: 191

SEQUENCE LENGTH: 45 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

TCACAGGGGAGTGATCTATGGTGGAGTGTCGCCCCCATCAGGGGG 45 
SEQ ID NO: 192 

SEQUENCE LENGTH: 40 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus

CCTGTGAGGAACTACTGTCTTCACGCAGAAAGCGTCTAGC 40
SEQ ID NO: 193 

SEQUENCE LENGTH: 37 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

CATGGCTAGACGCTTTCTGCGTGAAGACAGTAGTTCC 37 
SEQ ID NO: 194 

SEQUENCE LENGTH: 33 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCAAGCTTATGCTGCTGTCGCCCGGGCCCATCT 33 
SEQ ID NO: 195 

SEQUENCE LENGTH: 38 base pairs 

SEQUENCE TYPE: nucleic acid 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

ORIGINAL SOURCE 

ORGANISM: Hepatitis C virus 

GCGAATTCAGATCTTCATCATGTGTTGCAGTCGATCAC 38 

Claims 

1. An isolated gene encoding a polypeptide originated from hepatitis C virus, wherein said polypeptide
has an amino acid sequence of SEQ ID NO 101. 

2. An isolated gene encoding a polypeptide originated from hepatitis C virus, wherein said polypeptide
has an amino acid sequence of SEQ ID NO 102.



3. An isolated DNA originated from hepatitis C virus, wherein said DNA has a base sequence of SEQ ID 
NO 101. 


4. An isolated DNA originated from hepatitis C virus, wherein said DNA has a base sequence of SEQ ID 
NO 102. 

5. A polypeptide which comprises 115 amino acids from No. 1 to No. 115 of amino acid sequence of SEQ
ID NO 3 or 7.

6. An isolated DNA which encodes a polypeptide of Claim 127.

7. A polypeptide of 10 to 25 amino acid residues derived from amino acid sequence of SEQ ID NO 31 or
32, wherein said polypeptide comprises at least 6 amino acids from No. 182 to No. 187 of amino acid
sequence of SEQ ID NO 31 or 32.

8. A polypeptide of 10 to 25 amino acid residues derived from amino acid sequence of SEQ ID NO 31 or
32, wherein said polypeptide comprises at least 8 amino acids from No. 202 to No. 209 of amino acid
sequence of SEQ ID NO 31 or 32.

9. A polypeptide which comprises 106 amino acids from No. 109 to No. 214 of amino acid sequence of 
SEQ ID NO 31 or 32. 

10. A polypeptide which comprises 92 amino acids from No. 233 to No. 324 of amino acid sequence of
SEQ ID NO 31 or 32. 

11. A polypeptide of 10 to 25 amino acid residues derived from amino acid sequence of SEQ ID NO 31 or
32, wherein said polypeptide comprises at least 5 amino acids from No. 252 to No. 256 of amino acid
sequence of SEQ ID NO 31 or 32.

12. A polypeptide of 10 to 25 amino acid residues derived from amino acid sequence of SEQ ID NO 31 or 
32, wherein said polypeptide comprises at least 7 amino acids from No. 273 to No. 279 of amino acid
sequence of SEQ ID NO 31 or 32. 


13. A polypeptide of 10 to 25 amino acid residues derived from amino acid sequence of SEQ ID NO 31 or 
32, wherein said polypeptide comprises at least 7 amino acids from No. 136 to No. 142 of amino acid
sequence of SEQ ID NO 31 or 32. 

14. A polypeptide of 17 to 25 amino acid residues derived from amino acid sequence of SEQ ID NO 31 or
32, wherein said polypeptide comprises at least 17 amino acids from No. 53 to No. 69 of amino acid
sequence of SEQ ID NO 31 or 32. 

15. A polypeptide which comprises all or 266 amino acids from No. 461 to No. 726 of amino acid sequence 
of SEQ ID NO 43.

16. A polypeptide which comprises all or 42 amino acids from No. 963 to No. 1004 of amino acid sequence 
of SEQ ID NO 43. 

17. A polypeptide which comprises all or 45 amino acids from No. 283 to No. 327 of amino acid sequence
of SEQ ID NO 43. 

18. A polypeptide which comprises all or 74 amino acids from No. 477 to No. 550 of amino acid sequence 
of SEQ ID NO 43. 


19. A polypeptide which comprises 61 amino acids from No. 215 to No. 275 of amino acid sequence of 
SEQ ID NO 43.

20. A polypeptide which comprises all or 74 amino acids from No. 413 to No. 486 of amino acid sequence
of SEQ ID NO 75. 

21. A polypeptide which comprises all or 997 amino acids from No. 415 to No. 1411 of amino acid 
sequence of SEQ ID NO 75.

22. A polypeptide which comprises all or 19 amino acids from No. 247 to No. 265 of amino acid sequence 
of SEQ ID NO 75. 

23. A polypeptide which comprises all or 74 amino acids from No. 655 to No. 728 of amino acid sequence
of SEQ ID NO 75. 

24. A polypeptide which comprises all or 54 amino acids from No. 763 to No. 816 of amino acid sequence 
of SEQ ID NO 75. 


25. A polypeptide shown by at least 20 amino acid residues from No. 324 to No. 343 of amino acid 
sequence of SEQ ID NO 75, wherein said polypeptide comprises at least 8 amino acids. 

26. A polypeptide which comprises all or 98 amino acids from No. 858 to No. 955 of amino acid sequence 
of SEQ ID NO 75.

27. A polypeptide of 14 to 25 amino acid residues derived from amino acid sequence of SEQ ID NO 75,
wherein said polypeptide comprises at least 14 amino acids from No. 356 to No. 369 of amino acid
sequence of SEQ ID NO 75. 


28. A polypeptide which comprises all or 92 amino acids from No. 1009 to No. 1100 of amino acid 
sequence of SEQ ID NO 75. 

29. A polypeptide which comprises all or 66 amino acids from No. 1160 to No. 1225 of amino acid 
sequence of SEQ ID NO 75.

30. A polypeptide of 18 to 25 amino acid residues derived from amino acid sequence of SEQ ID NO 75,
wherein said polypeptide comprises at least 18 amino acids from No. 584 to No. 601 of amino acid 
sequence of SEQ ID NO 75. 


31. A polypeptide which comprises 42 amino acids from No. 615 to No. 656 of amino acid sequence of 
SEQ ID NO 75. 

32. A polypeptide of 11 to 25 amino acid residues derived from amino acid sequence of SEQ ID NO 75,
wherein said polypeptide comprises at least 11 amino acids from No. 326 to No. 337 of amino acid
sequence of SEQ ID NO 75.

33. A single-stranded DNA fragment or an antisense DNA fragment thereof which contains at least 15 
nucleotides selected from 317 nucleotides from No. 1 to No. 317 of SEQ ID NO 1, 9, 11 or 12.


34. The DNA fragment of Claim 221 comprising 16 to 30 base pairs. 

35. The DNA fragment of Claim 221 comprising 17 to 23 base pairs. 

36. The use of a DNA and/or a polypeptide as claimed in any of the preceding claims for the preparation of
a vaccine against hepatitis C virus. 

37. The use of a DNA and/or a polypeptide as claimed in any of the preceding claims for the
serodiagnosis of hepatitis C related diseases. 


38. The use of a DNA as claimed in any of the preceding claims for an in vitro and/or in vivo screening
system for a substance capable of specifically suppressing or controlling a proteolytic processing of a
precursor protein of hepatitis C virus.